# nebula
w
@wey In the absence of a unique constraint or index, is there a good merge-type query that can reduce the possibility of duplicates, where Nebula does a search before an insert?
g
You mean for vertices (and/or edges)? If it is for vertices, why not just overwrite the vertex tag properties, if you don’t need to keep the previous record?
w
As per Goran’s comment, INSERT overwrites the vertex if it already exists. For an edge it’s the same situation, as long as rank is not involved (it defaults to 0). Or could you give an example to help us understand the case?
w
This assumes I know the VID, or that I have a VID that is global. In most systems I deal with, the VID is some generated unique ID, like a snowflake ID. I would have really liked to use UPSERT; that would have been great. I have component keys for the data as well, so it's a bit rough.
g
It is fine to use any ID that works as a unique identifier; just use it as a string VID.
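A minimal sketch of the idea above for data with component keys: join the key parts into one deterministic string and use that as the string VID, so re-inserting the same record targets the same vertex. The function name `composite_vid`, the `|` separator, and the 32-byte limit are illustrative assumptions, not Nebula defaults; the limit should match whatever fixed-length string VID type the space was created with.

```python
def composite_vid(*parts: str, sep: str = "|", max_len: int = 32) -> str:
    """Join component keys into a single string usable as a string VID.

    sep and max_len are hypothetical choices for this sketch.
    """
    if not parts:
        raise ValueError("at least one key part is required")
    for part in parts:
        # Reject parts containing the separator, otherwise ("a|b", "c")
        # and ("a", "b|c") would collapse into the same VID.
        if sep in part:
            raise ValueError(f"key part {part!r} contains separator {sep!r}")
    vid = sep.join(parts)
    # Measure the length in bytes, since non-ASCII text takes more than
    # one byte per character in UTF-8.
    if len(vid.encode("utf-8")) > max_len:
        raise ValueError(f"VID {vid!r} is longer than {max_len} bytes")
    return vid


# The same component keys always produce the same VID.
print(composite_vid("github", "alice"))
```

Keys that can exceed the limit raise an error here instead of being silently truncated, since truncation could map two different records to the same vertex.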
w
Is the limit 256 like indexes? I looked for the restrictions in the docs and didn't see any
g
You should keep it as short as you can, because a longer VID uses more storage and memory and will impact performance. E.g., a 32-byte UUID is a good example.
👍 1
w
Yeah, I figured some of this. I want user names and other CVE-type identifiers to be unique, but there are others that get long.
g
If you are processing the data with your own code, you can hash the strings, e.g. with SHA-1 or SHA-256. Alternatively, you can use the algorithm Nebula uses, which is MurmurHash2; your implementation has to match the C++ one. It is available in different programming languages, so you just have to test that hash(YourString) inside Nebula returns the same number as your implementation. The advantage of using Nebula's algorithm is that you can fetch data with a simple query: WHERE hash(YourString) = VertexID
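Both options can be sketched client-side in Python. `sha256_vid` truncates a SHA-256 hex digest to a fixed-length string VID (the 32-character length is an illustrative choice). `murmur_hash64a` is a port of the public 64-bit MurmurHash2 variant (MurmurHash64A) returning a signed 64-bit integer; the default seed is an assumption about what Nebula uses, so, as noted above, compare the result against hash(YourString) in your own Nebula version before relying on it. Both function names are hypothetical.

```python
import hashlib

MASK64 = 0xFFFFFFFFFFFFFFFF


def sha256_vid(key: str, length: int = 32) -> str:
    """Fixed-length string VID from an arbitrarily long key (truncated hex digest)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:length]


def murmur_hash64a(data: bytes, seed: int = 0xC70F6907) -> int:
    """MurmurHash64A over bytes, returned as a signed 64-bit integer.

    The default seed is an assumption; verify against Nebula's hash().
    """
    m = 0xC6A4A7935BD1E995
    r = 47
    h = (seed ^ (len(data) * m)) & MASK64

    # Body: consume the input in 8-byte little-endian blocks.
    n_blocks = len(data) // 8
    for i in range(n_blocks):
        k = int.from_bytes(data[8 * i:8 * i + 8], "little")
        k = (k * m) & MASK64
        k ^= k >> r
        k = (k * m) & MASK64
        h ^= k
        h = (h * m) & MASK64

    # Tail: the remaining 1 to 7 bytes, also little-endian.
    tail = data[8 * n_blocks:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * m) & MASK64

    # Final mixing.
    h ^= h >> r
    h = (h * m) & MASK64
    h ^= h >> r

    # Reinterpret the unsigned result as a signed 64-bit integer.
    return h - (1 << 64) if h >= (1 << 63) else h


key = "some/very/long/natural/key"
print(sha256_vid(key))                       # candidate string VID
print(murmur_hash64a(key.encode("utf-8")))   # candidate integer VID
```

The SHA-256 route works with string VIDs and needs no agreement with the server. The MurmurHash2 route only pays off if the numbers match Nebula's hash() exactly, since that is what lets a query recompute the VID from the original string.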