
Kasper

08/30/2022, 7:41 AM
I'm testing on SNB SF100 generated with
nebula-bench
. Graph is deployed via docker-compose running on a 64 vCPU machine with 256GB RAM. I add a tag index to Post and rebuild:
CREATE TAG INDEX IF NOT EXISTS post_index ON Post();
REBUILD TAG INDEX post_index;
After that I use LOOKUP. Querying for Posts (~60M instances), this query returns in around 5 minutes:
LOOKUP ON Post yield id(vertex) | yield count(*)
whereas
LOOKUP ON Post yield id(vertex)
stalls. I tested both using the console as well as the Python client. Am I missing something? How should I retrieve a large number of results then? Use vertex scan? Some kind of pagination?

Goran Cvijanovic

08/30/2022, 7:51 AM
Be aware that rebuilding an index takes some time. You can also create the index in advance; that slows down inserting data a little, but the index stays up to date. Use SHOW TAG INDEX STATUS, SHOW JOBS, and SHOW JOB <id> to verify the status of the rebuild process.
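A minimal sketch of that check with nebula3-python (address, credentials, space name, and the job id are placeholders; the real id is printed when the REBUILD statement is submitted):

from nebula3.Config import Config
from nebula3.gclient.net import ConnectionPool

pool = ConnectionPool()
pool.init([('127.0.0.1', 9669)], Config())
session = pool.get_session('root', 'nebula')
session.execute('USE sf100')  # space name assumed

# Verify the index exists and the rebuild job has finished.
for stmt in ('SHOW TAG INDEX STATUS;', 'SHOW JOBS;', 'SHOW JOB 42;'):  # 42 = placeholder job id
    result = session.execute(stmt)
    if not result.is_succeeded():
        print(stmt, '->', result.error_msg())
        continue
    for i in range(result.row_size()):
        print(result.row_values(i))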

Jingchun

08/30/2022, 8:10 AM
Try SUBMIT JOB STATS then SHOW STATS if you only need the numbers of vertices and edges. It's better to use LIMIT and pagination when retrieving large amounts of data.
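For example (a sketch, assuming a connected nebula3-python session like the one above; SHOW STATS only reflects the most recently finished STATS job):

# Kick off the stats job, then read the per-tag/per-edge counts.
session.execute('SUBMIT JOB STATS;')
# ... poll SHOW JOBS until the STATS job is FINISHED ...
stats = session.execute('SHOW STATS;')
for i in range(stats.row_size()):
    print(stats.row_values(i))  # rows of [Type, Name, Count]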

Kasper

08/30/2022, 8:23 AM
Did that, the index job completed successfully.
Regarding pagination, you mean using SKIP and LIMIT, correct? Say I do MATCH ... SKIP 1000000 LIMIT 1000000, doesn't it still retrieve 2M results and then just discard the first 1M? Then each subsequent page would take more time.
Is LOOKUP then not the right way to retrieve large numbers of results/vertex IDs? As far as I'm aware, you cannot attach a LIMIT to a LOOKUP query.

Jingchun

08/30/2022, 8:26 AM
You can do that in LOOKUP, such as
LOOKUP ON Post yield id(vertex) | LIMIT 10, 10
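and page through it from Python like this (a sketch, assuming a connected nebula3-python session; each page still produces and then discards the offset rows, so later pages get slower):

# Page through Post vertex ids 10k at a time via | LIMIT <offset>, <count>.
page_size = 10000
offset = 0
while True:
    result = session.execute(
        f'LOOKUP ON Post YIELD id(vertex) AS vid | LIMIT {offset}, {page_size};')
    if result.row_size() == 0:
        break
    # cast() turns each ValueWrapper into a native Python value.
    vids = [result.row_values(i)[0].cast() for i in range(result.row_size())]
    # ... feed vids to the next stage ...
    offset += page_size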
You're right about the 2M results in your example.
Why do you need to retrieve that much data in one query? Could you shed some light on the scenario? Thanks.

Kasper

08/30/2022, 8:30 AM
I'm fetching a node-induced subgraph. These vertex IDs are my starting points.

Jingchun

08/30/2022, 8:38 AM
I see, so you will need all 58M Post vertices? In this case, would you consider using a client SDK instead of queries?
For example, in the Java client SDK there's an API to scan vertices.
❤️ 1

Kasper

08/30/2022, 9:25 AM
There's one for Python as well, no?
So scanning vertices is the preferred way, correct?
Scan will be much, much faster

Kasper

08/30/2022, 9:28 AM
I've been using the Python client already. Will switch to the scan approach then. Thanks!

Jingchun

08/30/2022, 9:28 AM
welcome 🙂

Goran Cvijanovic

08/30/2022, 10:53 AM
I had a use case where I needed to pick up newly inserted vertices and do something with them. I set an indexed timestamp property and used it to fetch vertices in a specific time range: I pick, e.g., a one-minute range, get the thousands or even hundreds of thousands of vertices inserted in that range, and process them with some graph queries. Then I get the next batch in a contiguous time frame without overlap, and so can process all vertices continuously. Maybe the idea fits your use case.
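A sketch of that windowed fetch (assuming a connected nebula3-python session and an indexed integer timestamp property; all names and bounds here are illustrative):

# Walk non-overlapping one-minute windows over an indexed `created` property.
window = 60                           # window size in seconds
start, end = 1262304000, 1262390400   # placeholder epoch bounds of the data
t = start
while t < end:
    result = session.execute(
        f'LOOKUP ON Post WHERE Post.created >= {t} AND Post.created < {t + window} '
        f'YIELD id(vertex) AS vid;')
    # ... process this window's vertices with further graph queries ...
    t += window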
❤️ 1

wey

08/31/2022, 12:19 AM
@Kasper for the Python storage client, please use the master version for now, as there is a bug in scanning data that is being fixed; a release to PyPI will be done later 🙂. Another thing to note: the storage client requires direct access to metad and storaged.
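A minimal sketch of such a scan (metad address, space, and tag names are placeholders; note the client talks to metad on its own port, typically 9559, not to graphd):

from nebula3.mclient import MetaCache
from nebula3.sclient.GraphStorageClient import GraphStorageClient

# The storage client resolves partitions via metad and then reads from
# storaged directly, so both must be reachable from this process.
meta_cache = MetaCache([('127.0.0.1', 9559)], 50000)
client = GraphStorageClient(meta_cache)

resp = client.scan_vertex(space_name='sf100', tag_name='Post')
while resp.has_next():
    for vertex_data in resp.next():
        print(vertex_data.get_id())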
👍 1