Episodes

  • Hey everyone, thank you so much for watching the 95th Weaviate Podcast! We are beyond honored to feature Dai Vu from Google on this one, alongside Weaviate Co-Founder Bob van Luijt! This podcast dives into all things Google Cloud Marketplace and the state of AI. It begins with the proliferation of Open-Source models and how Dai sees the evolving landscape with respect to models like Gemini Pro 1.5, Gemini Nano, and Gemma, as well as the integration of 3rd-party model providers such as Llama 3 on Google Cloud platforms such as Vertex AI. Bob and Dai then unpack the next move for open-source infrastructure providers, perspectives around "AI-Native" applications, trends in data gravity, perspectives on benchmarking, and Dai's "aha" moment in AI!

  • As you are graduating from ideas to engineering, one of the key concepts to be aware of is Parallel Computing and Concurrency. I am SUPER excited to share our 94th Weaviate podcast with Magdalen Dobson Manohar! Magdalen is one of the most impressive scientists I have ever met, having completed her undergraduate studies at MIT before joining Carnegie Mellon University to study Approximate Nearest Neighbor Search and develop ParlayANN. ParlayANN is one of the most enlightening works I have come across that studies how to build ANN indexes in parallel without the use of locking.

    In my opinion, this is the most insightful podcast we have ever produced on Vector Search, the core technology behind Vector Databases. The podcast begins with Magdalen’s journey into ANN science and the issue of Lock Contention in HNSW, further detailing HNSW vs. DiskANN vs. HCNNG and pyNNDescent, ParlayIVF, how Parallel Index Construction is achieved, conclusions from experimentation, Filtered Vector Search, Out-of-Distribution Vector Search, and exciting directions for the future!

    I also want to give a huge thanks to Etienne Dilocker, John Trengrove, Abdel Rodriguez, Asdine El Hrychy, and Zain Hasan. There is no way I would be able to keep up with conversations like this without their leadership and collaboration.

    I hope you find the podcast interesting and useful!

  • Hey everyone! I am SUPER excited to publish our newest Weaviate podcast with Kyle Davis, the creator of RAGKit! At a high level, the podcast covers our understanding of RAG systems through 4 key areas: (1) Ingest / ETL, (2) Search, (3) Generate / Agents, and (4) Evaluation. Discussing these led to all sorts of topics, from Knowledge Graph RAG to Function Calling and Tool Selection, Re-ranking, Quantization, and many more!

    This discussion forced me to re-think many of my previously held beliefs about the current RAG stack, particularly the definition of “Agents”. I came in believing that the best way of viewing “Agents” is as an abstraction on top of multiple pipelines, such as an “Email Agent”, but Kyle presented the idea of looking at “Agents” as scoping the tools each LLM call is connected to, such as `read_email` or `calculator`. Would love to know what people think about this one, as I think getting a consensus definition of “Agents” can clarify a lot of the current confusion for people building with LLMs / Generative AI.
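
    To make Kyle's framing concrete, here is a minimal, hypothetical sketch of an “agent” defined purely by which tools a given LLM call can see. The helper names and the hard-coded tool request are illustrative only; a real implementation would pass tool schemas to the model and parse its tool-call response.

    ```python
    from typing import Callable, Dict

    def read_email(folder: str) -> str:
        """Illustrative stub for an email-reading tool."""
        return f"(contents of the '{folder}' folder)"

    def calculator(expression: str) -> str:
        """Illustrative stub for a calculator tool."""
        return str(eval(expression))  # demo only; never eval untrusted input

    # Each "agent" is just an LLM call scoped to a subset of tools.
    EMAIL_AGENT_TOOLS: Dict[str, Callable] = {"read_email": read_email}
    MATH_AGENT_TOOLS: Dict[str, Callable] = {"calculator": calculator}

    def run_agent(user_message: str, tools: Dict[str, Callable]) -> str:
        """Pretend agent loop: only tools in scope can be executed."""
        # ... the model call would go here; we hard-code its tool request ...
        tool_name, tool_args = "read_email", {"folder": "inbox"}
        if tool_name in tools:
            return tools[tool_name](**tool_args)
        return "Tool not available to this agent."

    print(run_agent("Summarize my inbox", EMAIL_AGENT_TOOLS))
    ```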

  • I've seen a lot of interest around RAG for specific application domains: Legal, Accounting, Healthcare, and so on. David and Kevin are perhaps the best example of this I have seen so far, pivoting from Neum AI to VetRec!

    We begin the podcast by discussing the decision to switch gears, the advice given by Y Combinator, and David's experience in learning a new application domain.

    We then continue to discuss technical opportunities around RAG for Veterinarians, such as SOAP notes and Differential Diagnosis!

    We conclude with David's thoughts on the ETL space, companies like Unstructured and LlamaIndex's LlamaParse, advice for specific focus in ETL, and general discussions of ETL for Vector DBs / KGs / SQL.

    David and Kevin have been two of my favorite entrepreneurs I've met during my time at Weaviate! They do an amazing job of writing content that helps you live vicariously through them as they take on this opportunity to apply RAG and AI technologies to help Veterinarians!

    I really hope you enjoy the podcast!

  • Voyage AI is the newest giant in the embedding, reranking, and search model game!

    I am SUPER excited to publish our latest Weaviate podcast with Tengyu Ma, Co-Founder of Voyage AI and Assistant Professor at Stanford University!

    We began the podcast with a deep dive into everything embedding model training and contrastive learning theory. Tengyu delivered a masterclass in everything from scaling laws to multi-vector representations, neural architectures, representation collapse, data augmentation, semantic similarity, and more! I am beyond impressed with Tengyu's extensive knowledge and explanations of all these topics.

    The next chapter dives into a case study Voyage AI did fine-tuning an embedding model for the LangChain documentation. This is an absolutely fascinating example of the role of continual fine-tuning with very new concepts (for example, very few people were talking about chaining together LLM calls 2 years ago), as well as the data efficiency advances in fine-tuning.

    We concluded by discussing ML systems challenges in serving an embeddings API, particularly the challenge of detecting whether a request is for batch or query inference, and the optimizations that target either low latency (say, ~100ms) for a single query embedding or maximum throughput for batch embeddings.

  • One of the core values of DSPy is the ability to add “reasoning modules” such as Chain-of-Thought to your LLM programs!

    For example, Chain-of-Thought describes prompting the LLM with “Let’s think step by step …”. Interestingly, this meta-prompt around asking the LLM to think this way dramatically improves performance in tasks like question answering or document summarization.

    Self-Discover is a meta-prompting technique that searches for the optimal thinking primitives to integrate into your program! For example, you could use “Let’s think out of the box to arrive at a creative solution” or “Please explain your answer in 4 levels of abstraction: as if you are talking to a five year old, a high school student, a college student studying Computer Science, and a software engineer with years of experience in the topic”.
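
    As a rough illustration of what a “reasoning module” looks like in code, here is a minimal DSPy sketch. It assumes DSPy's `Signature`, `Predict`, and `ChainOfThought` APIs; the signature and the comment about Self-Discover are my own illustration, not Chris's implementation.

    ```python
    import dspy

    # dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))  # assumes an LM is configured

    class AnswerQuestion(dspy.Signature):
        """Answer the question concisely."""
        question = dspy.InputField()
        answer = dspy.OutputField()

    # Plain prediction: no reasoning primitive attached.
    plain = dspy.Predict(AnswerQuestion)

    # Chain-of-Thought: DSPy adds a "let's think step by step"-style
    # rationale field that the LLM fills in before producing the answer.
    cot = dspy.ChainOfThought(AnswerQuestion)

    # A Self-Discover-style module would instead search over thinking
    # primitives ("think out of the box", "explain at 4 levels of
    # abstraction", ...) and compile the best ones into the program.
    ```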

    I am SUPER excited to be publishing our 90th Weaviate Podcast with Chris Dossman! Chris has implemented Self-Discover in DSPy, one of the most fascinating examples so far of what the DSPy framework is capable of!

    Chris is also one of the most talented entrepreneurs I have met during my time at Weaviate thanks to introductions from Bob van Luijt and Byron Voorbach. Chris built one of the earliest RAG systems for government information and is now working on LLM opportunities in marketing with his new startup, Dicer.ai!

    I hope you enjoy the podcast, it was such a fun one and I learned so much!

  • Hey everyone! Thank you so much for watching the 89th Weaviate Podcast on Matryoshka Representation Learning! I am beyond grateful to be joined by the lead author of Matryoshka Representation Learning, Aditya Kusupati, Zach Nussbaum, a Machine Learning Engineer at Nomic AI bringing these embeddings to production, and my Weaviate colleague, Zain Hasan, who has done amazing research on Matryoshka Embeddings! We think this is a super powerful development for Vector Search! This podcast covers all sorts of details, from what Matryoshka embeddings are and the challenges of training them, to experiences building an embeddings API product at Nomic AI and how it ties in with Nomic Atlas, Aditya's research on differentiable ANN indexes, and many more! This was such a fun one, I really hope you find it useful! Please let us know what you think!
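
    For a quick feel of the core idea (a sketch with made-up numpy vectors, not Nomic's actual API): a Matryoshka-trained embedding can be truncated to a prefix of its dimensions and re-normalized, trading a little accuracy for much cheaper storage and vector search.

    ```python
    import numpy as np

    # Pretend this is a 768-dim embedding from a Matryoshka-trained model.
    full_embedding = np.random.randn(768).astype(np.float32)
    full_embedding /= np.linalg.norm(full_embedding)

    # Keep only the first 256 dimensions and re-normalize. Matryoshka training
    # packs the most important information into the earliest dimensions, so the
    # truncated vector remains a usable (if slightly less accurate) embedding.
    truncated = full_embedding[:256]
    truncated /= np.linalg.norm(truncated)

    print(full_embedding.shape, truncated.shape)  # (768,) (256,)
    ```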

  • Jason Liu is the creator of Instructor, one of the world's leading LLM frameworks, particularly focused on structured output parsing with LLMs, or as Jason puts it, "making LLMs more backwards compatible". It is hard to overstate the impact of Instructor; it is truly leading us to the next era of LLM programming. It was such an honor chatting with Jason; his experience currently as an independent consultant and previously engineering at StitchFix and Meta makes him truly one of the most unique guests we have featured on the Weaviate podcast! I hope you enjoy the podcast!
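
    For a flavor of what structured output parsing looks like in practice, here is a minimal sketch with Instructor: patching the OpenAI client so responses are validated into a Pydantic model. The model name and the `UserDetail` schema are just examples, and the exact API may differ between Instructor versions.

    ```python
    import instructor
    from openai import OpenAI
    from pydantic import BaseModel

    class UserDetail(BaseModel):
        name: str
        age: int

    # Patch the OpenAI client so completions are parsed into Pydantic models.
    client = instructor.patch(OpenAI())

    user = client.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserDetail,  # Instructor validates (and retries) against this schema
        messages=[{"role": "user", "content": "Extract: Jason is 25 years old."}],
    )
    print(user.name, user.age)
    ```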

  • Hey everyone! Thank you so much for watching the 87th episode of the Weaviate Podcast! I am SUPER excited to welcome Karel D'Oosterlinck! Karel is the creator of IReRa (Infer-Retrieve-Rank)! IReRa is one of the most impressive systems that have been built for Extreme Multi-Label Classification, leveraging the emerging paradigm of DSPy compilation! This podcast dives into all things IReRa, XMC, DSPy compilation, and applications in Biomedical NLP and Recommendation! I hope you find this useful!

  • Hey everyone! We are super excited to publish this podcast with Vinod Valloppillil and Bob van Luijt on Open-Source AI and future directions for RAG! The podcast begins by discussing Vinod's "Halloween Documents", a series of internal strategy writings at Microsoft related to the open-source software movement! The conversation continues to discuss the current state of Open-Source in AI. One of the major points Bob has been making about the business of AI models is that the models themselves are *stateless*, akin to an MP3 file. Vinod pushes back a bit on this definition, and they jointly settle on the view that these models fall into neither the pure stateful nor the stateless bucket, but rather a "pre-baked" bucket -- presenting completely new opportunities to build businesses around software. The conversation then continues to discuss the particular details of how people are building RAG systems and many directions for how that may evolve!

  • Hey everyone! I am beyond excited to present our interview with Omar Khattab from Stanford University! Omar is one of the world's leading scientists on AI and NLP. I highly recommend you check out Omar's remarkable list of publications linked below! This interview completely transformed my understanding of building RAG and LLM applications! I believe that DSPy will be one of the most impactful software projects in LLM development because of the abstractions around *program optimization*. Here is my TLDR of this concept of LLM programs and program optimization with DSPy; I of course encourage you to view the podcast and listen to Omar's explanation haha.

    RAG is one of the most popular LLM programs we have seen. RAG typically consists of two components: retrieve and then generate. Within the generate component we have a prompt like "please ground your answer based on the search results {search_results}". DSPy gives us a framework to optimize this prompt, bootstrap few-shot examples, or even fine-tune the model if needed. This works by compiling the program against some evaluation criteria we give DSPy. Now let's say we add a query re-writer that takes the query and writes a new query before sending it to the retrieval system, and a reranker that takes the search results and re-orders them before handing them to the answer generator. Now we have 4 components: query writer, retrieve, rerank, and answer. The 3 components of query writer, rerank, and answer each have a prompt that can be optimized with DSPy to enhance the description of the task or add examples! This optimization is done with DSPy's Teleprompters. (A minimal sketch of such a 4-component program is included below.)

    There are a few other really interesting components to DSPy as well -- such as the formatting of prompts with docstrings and the Signature abstraction, which in my view is quite similar to Instructor or LMQL. DSPy also comes with built-in modules like Chain-of-Thought that offer a really quick way to add this reasoning step and follow a structured output format. I am having so much fun learning about DSPy and I highly recommend you join me in viewing the GitHub repository linked below (with new examples!!).

    Omar also discusses ColBERT and late interaction retrieval! Omar describes how this achieves the contextualized attention of cross encoders but in a much more scalable system using the maximum similarity between vectors! Stay tuned for more updates from Weaviate as we are diving into multi-vector representations to hopefully support systems like this soon!
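
    To make the 4-component example above concrete, here is a minimal DSPy sketch. The module structure, signature strings, and metric are my own illustration (not Omar's code), and it assumes an LM and retrieval model have been configured via `dspy.settings`.

    ```python
    import dspy
    from dspy.teleprompt import BootstrapFewShot

    class RAGPipeline(dspy.Module):
        """Query writer -> retrieve -> rerank -> answer; each prompt is optimizable."""

        def __init__(self, k: int = 5):
            super().__init__()
            self.write_query = dspy.ChainOfThought("question -> search_query")
            self.retrieve = dspy.Retrieve(k=k)
            self.rerank = dspy.ChainOfThought("question, passages -> ranked_passages")
            self.answer = dspy.ChainOfThought("question, ranked_passages -> answer")

        def forward(self, question):
            search_query = self.write_query(question=question).search_query
            passages = self.retrieve(search_query).passages
            ranked = self.rerank(question=question, passages=passages).ranked_passages
            return self.answer(question=question, ranked_passages=ranked)

    # Compiling against a metric on a small trainset is what tunes the instructions
    # and few-shot examples of the three prompted components (the Teleprompter step).
    # teleprompter = BootstrapFewShot(metric=my_metric)
    # compiled_rag = teleprompter.compile(RAGPipeline(), trainset=trainset)
    ```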

    Chapters

    0:00 Weaviate at NeurIPS 2023!

    0:38 Omar Khattab

    0:57 What is the state of AI?

    2:35 DSPy

    10:37 Pipelines

    14:24 Prompt Tuning and Optimization

    18:12 Models for Specific Tasks

    21:44 LLM Compiler

    23:32 Colbert or ColBERT?

    24:02 ColBERT

  • Hey everyone! Thank you so much for watching the fourth and final episode of the AI-Native Database series with Dan Shipper! This was another epic one! Dan has had an absolutely remarkable career, creating and selling a company and now co-founding and working as the CEO of Every! Every is an incredibly future-looking business focused on content online, with an amazing newsletter, a community of writers and thinkers, an AI note-taking app, and more! I think Dan brings a very unique perspective to the series, as well as the Weaviate podcast broadly, because of his experience with writers and understanding how writers are going to use these new technologies! We heavily discussed the role of personality or subjectivity in AI, amongst many other topics! I really hope you enjoy the podcast; as always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!

    Read writings from Dan Shipper on Every: https://every.to/@danshipper

    Chapters
    0:00 AI-Native Databases
    0:58 Welcome Dan Shipper!
    1:37 GPT-4 is a Reasoning Engine
    8:40 Subjectivity in LLMs
    12:14 AI in Note Taking
    16:38 The opinions of LLMs
    25:50 Cookbooks for you
    31:16 Overdrive in LLMs
    34:50 Tweaking the voice of AI
    40:45 Multi-Agent Personalities

  • Hey everyone! Thank you so much for watching the 3rd episode of the AI-Native Database series featuring John Maeda and Bob van Luijt! This one dives into how humans perceive AI, from Anthropomorphization to Doomsday scenario thinking, and how important understanding how AI actually works is to the engineering of these systems. Bob and John discuss the evolution of the Design in Tech Report, 3 categories of design, and many other topics! I hope you enjoy the podcast! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!

    Links:
    Design in Tech Report: https://designintech.report/
    3 Kinds of Design: https://qz.com/1585165/john-maeda-on-the-importance-of-computational-design
    Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel

    Chapters
    0:00 AI-Native Databases
    0:58 Welcome John Maeda!
    1:35 Design in Tech Report
    4:07 Anthropomorphizing AI
    15:30 3 Types of Design
    19:30 The ChatGPT Shift
    22:58 Explaining Technology
    32:54 Impact of AI on the Creative Industries
    39:00 Semantic Kernel

  • Hey everyone! Thank you so much for watching the second episode of AI-Native Databases with Paul Groth! This was another epic one, diving deep into the role of structure in our data! Beginning with Knowledge Graphs and LLMs, there are two perspectives: LLMs for Knowledge Graphs (using LLMs to extract relationships or predict missing links) and then Knowledge Graph for LLMs (to provide factual information in RAG). There is another intersection that sits in the middle of both LLMs for KGs and KGs for LLMs, which is using LLMs to query Knowledge Graphs, e.g. Text-to-Cypher/SPARQL/... From there I think the conversation evolves in a really fascinating way exploring the ability to structure data on-the-fly. Paul says "Unstructured data is now becoming a peer to structured data"! I think in addition to RAG, Generative Search is another underrated use case -- where we use LLMs to summarize search results or parse out the structure. Super interesting ideas, I hope you enjoy the podcast -- as always more than happy to answer any questions or discuss any ideas you have about the content in the podcast!

    Learn more about Professor Groth's research here: https://scholar.google.com/citations?...
    Knowledge Engineering using Large Language Models: https://arxiv.org/pdf/2310.00637.pdf
    How Much Knowledge Can You Pack into the Parameters of a Language Model? https://arxiv.org/abs/2002.08910

    Chapters
    0:00 AI-Native Databases!
    0:58 Welcome Paul!
    1:25 Bob’s overview of the series
    2:30 How do we build great datasets?
    4:28 Defining Knowledge Graphs
    7:15 LLM as a Knowledge Graph
    15:18 Adding CRUD Support to Models
    28:10 Database of Model Weights
    32:50 Structuring Data On-the-Fly

  • Hey everyone! Thank you so much for watching the first episode of AI-Native Databases with Andy Pavlo! This was an epic one! We began by explaining the "Self-Driving Database" and all the opportunities to optimize DBs with AI and ML, both at the low level as well as in how we query and interact with them. We also discussed new opportunities with DBs + LLMs, such as bringing the data to the model (such as ROME, MEMIT, GRACE), in addition to bringing the model to the data (such as RAG). We also discuss the subjective "opinion" of these models and many more!

    I hope you enjoy the podcast! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast! This one means a lot to me. Andy Pavlo's CMU DB course was one of the most impactful resources in my personal education, and I love the vision for the future outlined by OtterTune! It was amazing to see Etienne Dilocker featured in the ML for DBs, DBs for ML series at CMU. I am so grateful to Andy for joining the Weaviate Podcast!

    Links:
    CMU Database Group on YouTube: https://www.youtube.com/@CMUDatabaseGroup/videos
    Self-Driving Database Management Systems - Pavlo et al. - https://db.cs.cmu.edu/papers/2017/p42-pavlo-cidr17.pdf
    Database of Databases: https://dbdb.io/
    Generative Feedback Loops: https://weaviate.io/blog/generative-feedback-loops-with-llms
    Weaviate Gorilla: https://weaviate.io/blog/weaviate-gorilla-part-1

    Chapters
    0:00 AI-Native Databases
    0:58 Welcome Andy
    1:58 Bob’s overview of the series
    3:20 Self-Driving Databases
    8:18 Why isn’t there just 1 Database?
    12:46 Collaboration of Models and Databases
    20:05 LLM Schema Tuning
    23:44 The Opinion of the System
    28:20 PyTorchDB - Moving the Data to the Model
    33:30 Database APIs
    38:15 Learning to operate Databases
    42:54 Vector DBs and the DB Hype Cycle
    51:38 SQL in Weaviate?
    1:07:40 The Future of DBs
    1:14:00 Thank you Andy!

  • Hey everyone! Thank you so much for watching the Weaviate 1.23 Release Podcast with Weaviate Co-Founder and CTO Etienne Dilocker! Weaviate 1.23 is a massive step forward for managing multi-tenancy with vector databases. For most RAG and Vector DB applications, you will have an uneven distribution in the # of vectors per user. Some users have 10k docs, others 10M+! Weaviate now offers a flat index with binary quantization to efficiently balance when you need an HNSW graph for the 10M-doc users and when brute force is all you need for the 10k-doc users! (A rough configuration sketch is included after the links below.)

    Weaviate also comes with some other "self-driving database" features, like lazy shard loading for faster startup times with multi-tenancy and automatic resource limiting with GOMEMLIMIT, plus other details Etienne shares in the podcast!

    I am also beyond excited to present our new integration with Anyscale (@anyscalecompute)! Anyscale has amazing pricing for serving and fine-tuning popular open-source LLMs. At the time of this release we are integrating Llama 70B/13B/7B, Mistral 7B, and Code Llama 34B into Weaviate -- but we expect much further development, adding support for fine-tuned models, the super cool new function calling models Anyscale announced yesterday, and other models such as diffusion and multimodal models!

    Chapters
    0:00 Weaviate 1.23
    1:08 Lazy Shard Loading
    8:20 Flat Index + BQ
    33:15 Default Segments for PQ
    38:55 AutoPQ
    42:20 Auto Resource Limiting
    46:04 Node Endpoint Update
    47:25 Generative Anyscale

    Links:
    Etienne Dilocker on Native Multi-Tenancy at the AI Conference in SF: https://www.youtube.com/watch?v=KT2RFMTJKGs
    Etienne Dilocker in the CMU DB Series: https://www.youtube.com/watch?v=4sLJapXEPd4
    Self-Driving Databases by Andy Pavlo: https://www.cs.cmu.edu/~pavlo/blog/2018/04/what-is-a-self-driving-database-management-system.html
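
    Here is that rough sketch: creating a multi-tenant collection with the flat index and binary quantization via the Weaviate Python client. The collection and property names are made up, and the exact v4 client syntax may differ slightly from what is shown.

    ```python
    import weaviate
    from weaviate.classes.config import Configure, DataType, Property

    client = weaviate.connect_to_local()

    client.collections.create(
        name="SupportDocs",  # made-up collection name
        properties=[Property(name="body", data_type=DataType.TEXT)],
        # Flat (brute-force) index with binary quantization: a good fit for
        # tenants with ~10k objects, while HNSW remains the choice for 10M+.
        vector_index_config=Configure.VectorIndex.flat(
            quantizer=Configure.VectorIndex.Quantizer.bq()
        ),
        multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    )

    client.close()
    ```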

  • Hey everyone! Thank you so much for watching the 78th episode of the Weaviate podcast featuring Rudy Lai, the founder and CEO of Tactic Generate! Tactic Generate has developed a user experience around applying LLMs in parallel to multiple documents, or even folders / collections / databases. Rudy discussed the user research that led the company in this direction and how he sees the opportunities in building AI products with new LLM and Vector Database technologies! I hope you enjoy the podcast; as always, more than happy to answer any questions or discuss any ideas you have about the content in the podcast!

    Learn more about Tactic Generate here: https://tactic.fyi/generative-insights/
    Weaviate Podcast #69 with Charles Pierse: https://www.youtube.com/watch?v=L_nyz1xs9AU

    Chapters
    0:00 Welcome Rudy!
    0:48 Story of Tactic Generate
    7:45 Finding Common Workflows
    19:30 Multiple Document RAG UIs
    26:14 Parallel LLM Execution
    32:40 Aggregating Parallel LLM Analysis
    38:25 Pretty Reports
    44:28 Research Agents

  • Hey everyone, thank you so much for watching the 77th Weaviate Podcast on RAGAS, featuring Jithin James, Shahul ES, and Erika Cardenas! RAGAS is one of the hottest rising startups in Retrieval-Augmented Generation! RAGAS began its journey with the RAGAS score, a matrix of evaluations for generation and retrieval. Generation is evaluated on Faithfulness (is the response grounded in the context?) as well as Relevancy (is the response useful?). Retrieval is then evaluated on Precision (how many of the search results are relevant to the question?) and Recall (how many of the relevant search results are captured in the retrieved results?). Now, the super novel thing about this is that an LLM is used to determine these metrics, so we circumvent painstaking manual labeling effort with the RAGAS score! (A small usage sketch follows after the chapter list below.)

    This podcast dives into the development of the RAGAS score as well as how RAG application builders should think about the knobs to tune for optimizing their RAGAS score: embedding models, chunking strategies, hybrid search tuning, rerankers, ... ?!? We also discussed tons of exciting directions for the future, such as fine-tuning smaller LLMs for these metrics, agents that use tuning APIs, and long context RAG!

    Check out the docs here for getting started with RAGAS! https://docs.ragas.io/en/latest/getstarted/index.html#get-started

    Chapters
    0:00 Welcome Jithin and Shahul!
    0:44 Welcome Erika!
    0:56 RAGAS, Founding Story
    2:38 Weaviate + RAGAS integration plans
    4:44 RAG Knobs to Tune
    25:50 RAG Experiment Tracking
    34:52 LangSmith and RAGAS
    38:55 LLM Evaluation
    40:25 RAGAS Agents
    44:00 Long Context RAG Evaluation
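
    Here is that small sketch of how the LLM-judged metrics are invoked. The evaluation rows are made up, and the exact imports and column names may differ between RAGAS versions; see the docs linked above.

    ```python
    from datasets import Dataset
    from ragas import evaluate
    from ragas.metrics import (
        answer_relevancy,
        context_precision,
        context_recall,
        faithfulness,
    )

    # Tiny made-up evaluation set: question, retrieved contexts, generated answer, reference.
    rows = {
        "question": ["What does Weaviate use HNSW for?"],
        "contexts": [["Weaviate uses the HNSW algorithm for approximate nearest neighbor search."]],
        "answer": ["Weaviate uses HNSW for approximate nearest neighbor vector search."],
        "ground_truth": ["HNSW powers Weaviate's approximate nearest neighbor search."],
    }

    # Each metric is scored by an LLM judge, replacing manual relevance labeling.
    results = evaluate(
        Dataset.from_dict(rows),
        metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
    )
    print(results)
    ```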

  • Hey everyone, I am SUPER excited to present our 76th Weaviate Podcast featuring Patrick Lewis, an NLP Research Scientist at Cohere! Patrick has had an absolutely massive impact on Natural Language Processing with AI and Deep Learning! Especially notable for the current climate in AI and Weaviate is that Patrick is the lead author of the original "Retrieval-Augmented Generation" paper!! Patrick has contributed to many other profoundly impactful papers in the space as well, such as DPR, Atlas, Task-Aware Retrieval with Instruction, and many many others! This was such an illuminating conversation; here is a quick overview of the chapters in the podcast!

    1. Origin of RAG - Patrick explains the build-up that led to the RAG paper: AskJeeves, IBM Watson, and the conceptual shift to retrieve-read in mainstream connectionist approaches to AI.
    2. Atlas - Atlas shows that a much smaller LLM, when paired with Retrieval-Augmentation, can still achieve competitive few-shot and zero-shot task performance. This is super impactful because this few-shot and zero-shot capability has been a massive evangelist for AI broadly, and the fact that smaller Retrieval-Augmented models can do this is massive for economically unlocking these applications.

    Teasing apart some architectural details of RAG:

    3. Fusion In-Decoder - An interesting encoder-decoder transformer design in which each document + the query is encoded separately, then concatenated and passed to the LM.
    4. End-to-End RAG - How to think about jointly training an embedding model and an LLM augmented with retrieval?
    5. Query Routers - How to route queries to, say, SQL or Vector DBs? (More nuance on this later with Multi-Index Retrieval.)
    6. ConcurrentQA - Super interesting work on the privacy of multi-index routers. For example, if you ask "Who is the father of our new CEO?", this may reveal the private information of the new CEO with the public query of their father.
    7. Multi-Index Retrieval
    8. New APIs for LLMs
    9. Self-Instructed Gorillas
    10. Task-Aware Retrieval with Instructions
    11. Editing Text, EditEval and PEER
    12. What future direction excites you the most?

    Links:
    Learn more about Patrick Lewis: https://www.patricklewis.io/
    Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: https://arxiv.org/abs/2005.11401
    Atlas: https://arxiv.org/pdf/2208.03299.pdf
    Fusion In-Decoder: https://arxiv.org/pdf/2007.01282.pdf

    Chapters
    0:00 Welcome Patrick Lewis!
    0:36 Origin of RAG
    5:20 Atlas
    10:43 Fusion In-Decoder
    17:50 End-to-End RAG
    27:05 Query Routers
    32:05 ConcurrentQA
    37:30 Multi-Index Retrieval
    40:05 New APIs for LLMs
    41:50 Self-Instructed Gorillas
    44:35 Task-Aware Retrieval with Instructions
    52:00 Editing Text, EditEval and PEER
    55:35 What future direction excites you the most?

  • Hey everyone! Thank you so much for watching the 75th Weaviate Podcast featuring Tanmay Chopra! The podcast details Tanmay's incredible career in Machine Learning, from TikTok to Neeva and now building his own startup, Emissary! Tanmay shared some amazing insights into Search AI, such as how to process Temporal Queries, how to think about diversity in Retrieval, and Query Recommendation products! We then dove into the opportunity Tanmay sees in fine-tuning LLMs and knowledge distillation that motivated him to build Emissary! I thought Tanmay's analogy of GPT-4 to 3D printers was really interesting; tons of great nuggets in here! I really hope you enjoy the podcast; as always, more than happy to answer any questions or discuss any ideas with you related to the content in the podcast!

    Chapters
    0:00 Welcome Tanmay!
    0:23 Early Career Story
    2:02 TikTok
    4:10 Neeva
    8:45 Temporal Queries
    11:40 Retrieval Diversity
    17:22 Query Recommendation
    23:20 Emissary, starting a company!
    30:20 A Simple API for Custom Models
    35:42 GPT-4 = 3D Printer?