Technical Case Study: Graffinity's technical evolution - Advancing search and knowledge graphs | 果冻影院 Centre for Digital Innovation

6 June 2024

Graffinity, an EdTech startup leveraging machine learning to create searchable mind maps and accelerate comprehension for learners disadvantaged by large volumes of text, joined Cohort 4 of the 果冻影院 CDI Impact Accelerator to enhance the technical aspects of their digital innovation. Through participation in the Technical Needs Assessment workshop, they pinpointed three critical areas for improvement to focus on during the programme: revolutionising search capabilities, boosting the quality of their Knowledge Graph (KG), and refining the platform's design. Their journey was marked by challenges, experimentation, and breakthroughs, culminating in significant technical advancements.

Revolutionising Search Functionality

Graffinity identified that their search functionality was underperforming in three key areas, necessitating improvements for a better user experience.

1. Addressing Misspellings: The system was intolerant of misspellings, requiring a "fuzzier" approach to accommodate user typos while still functioning effectively.

2. Supporting Concept Searches: Searches were limited to single entities (e.g., the name of a person or company) that had to exist within the Knowledge Graph (KG). They aimed to support searches on concepts even if these weren't directly represented in the KG.

3. Enhancing Relevance: The system often returned irrelevant related entities, highlighting the need to enhance the relevance of the results. They sought a solution that was both robust and intuitive.
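
To make the first requirement concrete, the following is a minimal sketch of a conventional string-similarity baseline using Python's standard `difflib` module. It is an illustration only and not Graffinity's approach: the function name `closest_entity`, the sample entity names, and the `cutoff` value are all assumptions introduced here.

```python
import difflib
from typing import List, Optional


def closest_entity(query: str, entity_names: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the entity name most similar to the query, or None if nothing is close.

    Hypothetical helper: compares lowercased strings with difflib's
    similarity ratio and maps the best match back to its original spelling.
    """
    lowered = {name.lower(): name for name in entity_names}
    matches = difflib.get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# Hypothetical entity names standing in for nodes in the KG.
entities = ["Macbeth", "Lady Macbeth", "Banquo"]
print(closest_entity("Macbteh", entities))  # a simple transposition typo
print(closest_entity("zzzz", entities))     # nothing similar in the list
```

A baseline of this kind only touches the misspelling requirement. It cannot answer a concept query whose wording does not resemble any entity name, which is the gap the second and third requirements describe.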

Initially, the team turned to Amazon Kendra for its powerful semantic search. However, integrating Kendra with their KG proved unfeasible: because Kendra couldn't interact directly with the KG, Graffinity had to rethink their approach. Facing this roadblock, Graffinity explored the potential of Large Language Models (LLMs). They considered a Retrieval-Augmented Generation (RAG) approach but encountered a catch-22: to implement RAG, they first needed the very search improvements they were hoping to get from it. This led them to a novel strategy: LLM-augmented search.

Using Anthropic's Claude 2 foundation model via Amazon Bedrock, Graffinity worked with data scientists from ARC to devise a system in which user queries were fed to the LLM to identify relevant entities. Those entities were then used to pull additional data from the KG. The results were striking. For instance, a user inputting "Mab ceqth Shbajelaspear" would receive a graph displaying Macbeth and key characters from Shakespeare's play, showing that the system could handle heavy misspellings and contextual searches. Concept searches, such as "what is the historical context for Macbeth," yielded comprehensive graphs connecting entities from the play with relevant figures from Scottish history.
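
The two-step flow described above (the LLM names the relevant entities, then the KG is queried for them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Graffinity's implementation: the prompt wording, the in-memory `KG` dictionary, the function names, and the `fake_llm` stand-in are all hypothetical. In a real system the `complete` callable would wrap a hosted model call, and the lookup would be a graph database query.

```python
import json
from typing import Callable, Dict, List

# Toy knowledge graph: entity name -> related entity names (hypothetical data).
KG: Dict[str, List[str]] = {
    "Macbeth": ["Lady Macbeth", "Banquo", "Duncan"],
    "William Shakespeare": ["Macbeth"],
}

# Hypothetical prompt; the wording used by Graffinity is not given in the source.
PROMPT_PREFIX = (
    "Identify the entities most relevant to the user's search query. "
    "Correct obvious misspellings. Reply with a JSON array of entity "
    "names and nothing else.\n\nQuery: "
)


def extract_entities(query: str, complete: Callable[[str], str]) -> List[str]:
    """Ask the LLM for entity names and parse its JSON reply defensively."""
    reply = complete(PROMPT_PREFIX + query)
    try:
        names = json.loads(reply)
    except json.JSONDecodeError:
        return []  # the model did not return valid JSON
    if not isinstance(names, list):
        return []
    return [name for name in names if isinstance(name, str)]


def augmented_search(
    query: str,
    complete: Callable[[str], str],
    kg: Dict[str, List[str]] = KG,
) -> Dict[str, List[str]]:
    """Return {entity: neighbours} for LLM-suggested entities that exist in the KG."""
    subgraph: Dict[str, List[str]] = {}
    for name in extract_entities(query, complete):
        if name in kg:  # ignore suggestions the KG does not contain
            subgraph[name] = kg[name]
    return subgraph


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return '["Macbeth", "William Shakespeare"]'


print(augmented_search("Mab ceqth Shbajelaspear", fake_llm))
```

Keeping the model behind a plain callable means the parsing and KG lookup can be tested without network access, and the model can be swapped without touching the search logic.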

Enhancing the Knowledge Graph

With search capabilities on a new trajectory, Graffinity turned to improving the KG itself. The goal was to extract entities and relationships from unstructured text to enrich their existing data.

Here, they explored Amazon Comprehend's beta NLP features for relationship extraction. While promising, these fell short on entity disambiguation, which was crucial for their needs. Returning to Bedrock and Claude 2, they used the LLM itself to extract entities and relationships from text. This approach not only worked but excelled. A proof of concept scraped websites to generate KGs, with plans to let users upload their own text. By adding Amazon Transcribe, Graffinity envisioned a future in which video and audio files could also be converted into rich KGs.
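
One way to picture LLM-based extraction is to ask the model for subject-relation-object triples in JSON and then validate the reply before it reaches the graph. The sketch below is an assumption-laden illustration, not the method used in the programme: the prompt text, the `Triple` type, the function names, and the canned model reply are hypothetical.

```python
import json
from typing import Callable, List, NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical prompt; the source does not give the actual wording.
EXTRACTION_PROMPT = (
    "Extract entities and relationships from the text below. Reply only with "
    'a JSON array of objects that have the keys "subject", "relation" and '
    '"object".\n\nText: '
)


def parse_triples(reply: str) -> List[Triple]:
    """Parse a model reply into unique, well-formed triples, skipping bad items."""
    try:
        items = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    triples: List[Triple] = []
    seen = set()
    for item in items:
        if not isinstance(item, dict):
            continue
        fields = [item.get("subject"), item.get("relation"), item.get("object")]
        # Every field must be a non-empty string.
        if not all(isinstance(f, str) and f.strip() for f in fields):
            continue
        triple = Triple(*(f.strip() for f in fields))
        if triple not in seen:  # drop exact duplicates, keep first-seen order
            seen.add(triple)
            triples.append(triple)
    return triples


def extract_graph(text: str, complete: Callable[[str], str]) -> List[Triple]:
    """Send the text to the model (any callable) and return validated triples."""
    return parse_triples(complete(EXTRACTION_PROMPT + text))


def fake_llm(prompt: str) -> str:
    """Canned reply with one duplicate and one malformed item, for offline runs."""
    return json.dumps([
        {"subject": "Macbeth", "relation": "KILLS", "object": "Duncan"},
        {"subject": "Macbeth", "relation": "KILLS", "object": "Duncan"},
        {"subject": "Banquo", "relation": "FRIEND_OF"},
    ])


for triple in extract_graph("Macbeth murders King Duncan.", fake_llm):
    print(triple)
```

Validation of this kind does not solve entity disambiguation on its own; it only ensures that whatever the model returns is structurally safe to load into a graph.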

Optimising Platform Design

Graffinity's Neo4j graph database was running in a single container within their Amazon EKS cluster, lacking persistent storage and scalability. They needed a more robust solution.

Amazon Neptune presented the answer. Because Neptune supports the openCypher query language, Graffinity could create a serverless cluster and load their graph database from the same CSV files used for Neo4j. Their existing Cypher queries worked without modification, which let them move their web application to a more scalable, enterprise-grade database environment.
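
As a rough illustration of what querying Neptune with openCypher can look like, the sketch below builds a parameterised request for Neptune's HTTPS openCypher endpoint. The cluster hostname, the `name` property, the query itself, and the helper name are assumptions made for this example. Graffinity's schema and queries are not given in the source, and a cluster with IAM authentication enabled would additionally need signed requests.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical query: fetch an entity and its direct neighbours by name.
NEIGHBOUR_QUERY = (
    "MATCH (e {name: $name})-[r]-(n) "
    "RETURN e.name AS entity, type(r) AS relation, n.name AS neighbour "
    "LIMIT 25"
)


def build_opencypher_request(
    endpoint: str,
    query: str,
    parameters: dict,
    port: int = 8182,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the openCypher HTTPS endpoint.

    The query and its JSON-encoded parameters are sent as form fields.
    """
    body = urllib.parse.urlencode(
        {"query": query, "parameters": json.dumps(parameters)}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{endpoint}:{port}/openCypher",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder hostname; replace with a real cluster endpoint before sending.
request = build_opencypher_request(
    "example-cluster.invalid", NEIGHBOUR_QUERY, {"name": "Macbeth"}
)
print(request.get_method(), request.full_url)
# Sending would look like: urllib.request.urlopen(request), run from inside
# the cluster's network. It is left out here so the sketch runs offline.
```

Passing the entity name as a query parameter rather than concatenating it into the Cypher string avoids injection issues with user-supplied search text.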

A Technical Triumph

Graffinity's technical journey on the 果冻影院 CDI Accelerator was transformative. By overcoming search limitations, enhancing their KG, and optimising their database design, they not only improved user experience but set a new standard for their platform's capabilities. Through innovation and resilience, Graffinity emerged stronger, ready to tackle future challenges and seize new opportunities.