Discussion about this post

Stephen Fitzpatrick:
One of the most misleading things about AI and context windows is precisely what you are describing here. Many commentators (and the companies themselves) claim that you can put an entire book, say a 300-page PDF, into something like NotebookLM and use it as a "source" from which to generate content. Unfortunately, as you note, this doesn't work reliably, and other people have described exactly what you discuss here: the importance of chunking. I've found that anything more than a 10-15 page dense selection of text will miss things. It's great for including shorter research articles, but anything longer needs to be handled differently. I teach a research class where we've used NotebookLM, but its limits really need to be explained clearly for it to be effective. Many students aren't maximizing its potential, so there definitely needs to be more of this kind of explanation and training.
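The chunking the comment describes, splitting a long document into overlapping pieces small enough for the tool to handle, can be sketched as follows. The chunk and overlap sizes here are illustrative assumptions, not NotebookLM's actual internals:

```python
def chunk_words(text, chunk_size=400, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size words.

    Overlap keeps sentences that straddle a boundary present in at
    least one chunk. The sizes are assumptions for illustration only.
    """
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# Stand-in for a long document: 1000 numbered "words".
doc = " ".join(str(i) for i in range(1000))
pieces = chunk_words(doc)
print(len(pieces))  # 3 chunks, each small enough to feed in separately
```

Each chunk can then be uploaded as its own source, rather than relying on the tool to cover one 300-page upload evenly.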

Rainbow Roxy:

Excellent analysis of how document chunking shapes findability. What if the system itself could learn to dynamically adjust the granularity of the meaning space based on the user's evolving intent?
