mark_l_watson 4 days ago | next |

I was fortunate to be hired as a contractor 11 years ago to work on an internal Google Knowledge Graph application. Google is just one of many large companies that use one huge graph to consolidate information from many sources.

I bought into Tim Berners-Lee's (TBL's) Semantic Web ideas (and I cover 'lower case' semantic web topics in a few of my books). I think it is a shame that publicly accessible world knowledge graphs never really took off, but at least Google's Data Commons is available for free for non-commercial, educational, and research uses.

ramraj07 4 days ago | root | parent | next |

Could you provide any insight into why KGs work well in such places? Like a contrived example maybe?

mark_l_watson 4 days ago | root | parent |

Well, from public information: Meta has a huge social graph that helps support their social media businesses, and Google has a wealth of real-world knowledge. These graphs are, I think, optimized for super-fast, millisecond-level query latencies rather than for a rich query language (so nothing like SPARQL).

I have just been looking at the Data Commons data sets, and I think I will add a fun example to my live Common Lisp eBook (and/or my live Racket eBook).
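Something in that spirit, sketched in Python rather than Lisp (a minimal sketch assuming the datacommons Python client; geoId/06 is the Data Commons DCID for California and Count_Person is one of its statistical variables, but treat the exact call as an assumption):

    # Sketch: pull a population time series from Data Commons.
    # Assumes the datacommons client (pip install datacommons).
    import datacommons as dc

    # geoId/06 = California; Count_Person = total population.
    series = dc.get_stat_series("geoId/06", "Count_Person")
    for date in sorted(series):
        print(date, series[date])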

zozbot234 3 days ago | root | parent | prev |

> publicly accessible world knowledge graphs never took off

Huh? What's wrong with Wikidata and the Linked Open Data cloud? These seem quite real to me.

mark_l_watson 3 days ago | root | parent |

Well, they didn't 'take off' enough for me. I talked with TBL once about the success of the Semantic Web, and I don't want to misquote him, but I felt like he shared my view: great tech, but not used enough.

I personally use Wikidata and DBpedia, and I have for years.
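For example, a minimal sketch of querying Wikidata's public SPARQL endpoint from Python (query.wikidata.org and the P31/P36 properties are real; the query itself is just an illustration):

    # Sketch: list a few countries and their capitals from Wikidata.
    import requests

    query = """
    SELECT ?countryLabel ?capitalLabel WHERE {
      ?country wdt:P31 wd:Q6256 ;  # instance of: country
               wdt:P36 ?capital .  # capital
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } LIMIT 5
    """

    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": query, "format": "json"},
        headers={"User-Agent": "kg-example/0.1"},  # Wikidata asks for a UA
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["countryLabel"]["value"], "->",
              row["capitalLabel"]["value"])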

westurner 4 days ago | prev | next |

> Retrieval Interleaved Generation (RIG): This approach fine-tunes Gemma 2 to identify statistics within its responses and annotate them with a call to Data Commons, including a relevant query and the model's initial answer for comparison. Think of it as the model double-checking its work against a trusted source.

> [...] Trade-offs of the RAG approach: [...] In addition, the effectiveness of grounding depends on the quality of the generated queries to Data Commons.
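Roughly, the interleaved flow is something like the sketch below (the annotation format and both helper functions are hypothetical illustrations, not the actual fine-tuning interface):

    # Sketch of the RIG idea: the model emits a statistic plus a
    # Data Commons lookup to check it against. The [DC(...) -> ...]
    # format, parse_annotations, and query_data_commons are all
    # hypothetical stand-ins.
    import re

    def parse_annotations(text):
        return re.findall(r"\[DC\((.*?)\) -> (.*?)\]", text)

    def query_data_commons(q):
        # Placeholder for a real Data Commons API call.
        return {"population of California in 2022": "39.0 million"}.get(q)

    response = ("California had about 39 million residents "
                "[DC(population of California in 2022) -> 39 million].")

    for q, model_value in parse_annotations(response):
        trusted = query_data_commons(q)
        print("model said %r; Data Commons says %r" % (model_value, trusted))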

Groxx 4 days ago | root | parent |

Gotta say, this kinda feels like "giving it correct data didn't work, so what if we did that twice?".

Like, two layers of duct tape are better than one.

Seems reasonable and I can believe it helps; it just also seems like it doesn't do much to improve confidence in the system as a whole, particularly since they're basically asking it to find things worth checking, then having it write the checking query, then having it interpret the results, when it's the thing that screwed up in the first place.

vineyardmike 4 days ago | root | parent | next |

eh, I think this is pretty reasonable and not a "hack". It matches what we do as people. I think there probably needs to be research into how a model can tell when it doesn't know something, however.

I think if you remember that LLMs are not databases, but do contain a super lossy-compressed version of their training knowledge, this feels less like a hack. If you ask someone, "who won the World Cup in 2002?", they may say "I think it was X, but let me google it first". That person isn't screwed up; using tools isn't a failure.

If the context is a work setting, or somewhere that is data-centric, it totally makes sense to check it. Think of a chatbot for a store, or a company that is helping someone troubleshoot or research. Anything with really obvious answers that are easy to learn from volumes of data ("what company makes the Corolla?") probably doesn't need fact-checking as often, but why not have the system check its work?

Meanwhile, programming, writing prose, etc. are not things you generally fact-check midway, and they can be "learned" well from statistical volume. Most programmers can get "pretty good" syntax on the first try, any dedicated syntax tool will get to basically 100%, and the same makes sense for an LLM.

westurner 4 days ago | root | parent | next |

This is similar to the difference between data dredging and scientific hypothesis testing.

'But what bias did we introduce with [LLM knowledgebase] background research prior to formulating a hypothesis, and who measured it?'

There are various methods of inference: inductive, deductive, and abductive.

What are the limits of null-hypothesis methods of scientific inquiry?

taneq 4 days ago | root | parent | prev |

> Like, two layers of duct tape are better than one.

Uh... they are? They're not better than a properly specc'd fastener installed at appropriately engineered mounting points, but they're still better than one layer of duct tape, let alone none.

openrisk 4 days ago | prev | next |

The public/non-governmental sector (especially in Europe) has been quite keen on (linked) open data, knowledge graphs, and associated technologies for decades. Yet applications, tools, and ultimately usability, awareness, and adoption have been lagging.

In this sense the project offers a remarkable, albeit implicit, endorsement of that broader open data space, as it comes from a major private-sector entity and links with the hot LLM technology of the day.

At a high level, though, this design seems to violate the "bitter lesson" gospel [1]:

> 1) AI researchers have often tried to build knowledge into their agents

Which is at the same time refreshing (as there is something very incomplete and self-defeating in the "scaling" hypothesis) and a hint at the difficulties of meaningfully integrating very heterogeneous sources and representations of information.

[1] http://www.incompleteideas.net/IncIdeas/BitterLesson.html

zrank 4 days ago | prev | next |

So, essentially, you could just Google the answer in the first place, click on your own trusted sources, and compare them, all without using a language model.

Or buy an encyclopedia ...

amelius 4 days ago | prev |

Information isn't the only problem. Another problem is the correct application of logic.