Thứ Bảy, 31 tháng 3, 2018

Apache Solr : Binary Field

In solr, binary field use for binary data storage. When you need to optimize your Solr (performance, speed), you should consider to use that field.
When using with BinaryField, the data should be sent/retrieved in as Base64 encoded Strings.
Pay attention to the byte order you use.( BigEndian or LittleEndian)

In my case,

- I use Python to calculate some metrics and generate a vector represented for item, a vector with
8 float numbers. I use base64encode to encode this vector and save it to solr, for example:
value = 5.1
va = bytearray(struct.pack("f", value))
base64.b64encode(va)


The bytearray func use LittleEndian by default, you can replace with BigEndian order like that:
va = bytearray(struct.pack(">f", value))

- I use Java to read the info from BinaryField:
java.nio.ByteBuffer.wrap(bytes).getFloat()


This use BigEndian order by default, you need to change it to same order you use to write to
Solr by:
ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getFloat()
or 
ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getFloat()




Không có nhận xét nào:

Đăng nhận xét