Hello my graphic card, I wanted to ask you a question

I've lived to see interesting times where on my laptop with RTX I can not only see what graphic wonders the gamedev industry can squeeze out of integrated circuits but also chat with my resources. However strange it sounds...

Chat with RTX is new software from Nvidia (published mid-Feb), thanks to which we can harness our graphics card for "serious" tasks like answering where my notes file is on the topic of this and that. And all of this happens privately, on your hardware, using the Tensor RT, which we enrich with knowledge about our resources.

Installation

The installation file takes up 35GB and requires Windows and an Nvidia card from the 30/40 series with RTX on board.

The installed application takes up less than 30GB.

Chat with RTX disk usage

After starting, a terminal with a running server will open, which we connect to via localhost - on my machine it's on port 11549. I have one model available, Mistral 7B int4.

Chat with RTX main screen

As a Dataset, we can specify a local folder (supported formats are pdf, doc, and txt) or a YouTube URL.

Summarizing a YouTube video

So, I'm doing the first test asking for a summary in 500 keynote slides from SAP Tech Ed 2023:

Chat with RTX YouTube link test

OK, for the summary, AI chose one main topic discussed in the keynote - AI :D. But hey, there was more. I did another test asking for a summary in 2000 words - this time I got more topics - AI and SAP Analytics Cloud. However, other topics are still omitted, which in my opinion should be included in such a summary. Another prompt is a direct request for ALL topics covered in the keynote:

Chat with RTX YouTube link test - summary

Well, great, we have all listed. Now let's ask what Joule is?

Chat with RTX YouTube link test

Finally - will SAP HANA get the Vector Engine and if so - when will it be available?

Chat with RTX YouTube link test

I didn't mean it's being used internally - just when it will be available for customers.

Asked differently:

Chat with RTX YouTube link test

Alright. The prompt matters, in this case I feel like adding external before clients guided towards the goal, because the context for this statement is as follows:

Chat with RTX YouTube link test - context

Asking about knowledge from local files

Alright, enough talking about videos. It's time for some meat in the form of queries in the context of my files. I point to a directory with my purchased PDFs from SAP Press (about 2GB) and the "knowledge acquisition" process begins:

Chat with RTX - embedding generation log

I have the "Performance" mode set, during embedding generation the GPU works at full capacity:

Chat with RTX resource consumption

Generating on my configuration (Intel Core i9 + RTX 4080 (12GB DDR6) + 64GB RAM) took approx. 12 minutes, as a result, alongside the original directory, we have an additional one with "local knowledge" (about 100MB):

Chat with RTX vector embeddings

Then I ask if I have any materials on the SAP tool "BRF+":

Chat with RTX - BRF+ query result

I get a good answer. Now a more specific query - I have information in several books on how to register a service in SAP Gateway:

Chat with RTX - SAP Gateway result

The model selects the book, where there is a very short paragraph titled Register the OData Service - which seems OK, but my internal censor said that the book OData and SAP NetWeaver Gateway should be chosen as the main and more reliable source.

Let's see another question and answer:

Chat with RTX - SAP Gateway query

Again - I would expect more information to be pointed out from the book OData and SAP NetWeaver Gateway. The guiding prompt to this source already did the job:

Chat with RTX - SAP Gateway query

Summary

I feel like the first shots at the chat missed my personal expectations a bit - yet still the answers were valid. But, firstly - I'm not a prompt master, and secondly - Nvidia's software is quite fresh and already looks promising. Searching through a bunch of files to extract some specific information, making a summary about given topic - cool.

From my developer's point of view - I wonder if there will be a local version of the model, which would handle source files of various programming languages - it doesn't even have to be as advanced as CoPilot etc. Even running simpler prompts for finding information would be nice, instead of dealing with advanced search functionalities in IDEs.