LLM RAG in Practice (21) | Analyzing Product Reviews with LlamaIndex's Text2SQL and RAG Features

2024/10/6 12:31:39


To analyze product reviews, we can use LlamaIndex to combine SQL with RAG (Retrieval Augmented Generation).


For this demonstration, we used GPT-4 to generate a sample dataset containing reviews of three products: iPhone 13, SamsungTV, and Ergonomic Chair. Here are some example reviews:

iPhone 13: “Amazing battery life and camera quality. Best iPhone yet.”

SamsungTV: “Impressive picture clarity and vibrant colors. A top-notch TV.”

Ergonomic Chair: “Feels really comfortable even after long hours.”


rows = [
    # iPhone13 Reviews
    {"category": "Phone", "product_name": "Iphone13", "review": "The iPhone13 is a stellar leap forward. From its sleek design to the crystal-clear display, it screams luxury and functionality. Coupled with the enhanced battery life and an A15 chip, it's clear Apple has once again raised the bar in the smartphone industry."},
    {"category": "Phone", "product_name": "Iphone13", "review": "This model brings the brilliance of the ProMotion display, changing the dynamics of screen interaction. The rich colors, smooth transitions, and lag-free experience make daily tasks and gaming absolutely delightful."},
    {"category": "Phone", "product_name": "Iphone13", "review": "The 5G capabilities are the true game-changer. Streaming, downloading, or even regular browsing feels like a breeze. It's remarkable how seamless the integration feels, and it's obvious that Apple has invested a lot in refining the experience."},
    # SamsungTV Reviews
    {"category": "TV", "product_name": "SamsungTV", "review": "Samsung's display technology has always been at the forefront, but with this TV, they've outdone themselves. Every visual is crisp, the colors are vibrant, and the depth of the blacks is simply mesmerizing. The smart features only add to the luxurious viewing experience."},
    {"category": "TV", "product_name": "SamsungTV", "review": "This isn't just a TV; it's a centerpiece for the living room. The ultra-slim bezels and the sleek design make it a visual treat even when it's turned off. And when it's on, the 4K resolution delivers a cinematic experience right at home."},
    {"category": "TV", "product_name": "SamsungTV", "review": "The sound quality, often an oversight in many TVs, matches the visual prowess. It creates an enveloping atmosphere that's hard to get without an external sound system. Combined with its user-friendly interface, it's the TV I've always dreamt of."},
    # Ergonomic Chair Reviews
    {"category": "Furniture", "product_name": "Ergonomic Chair", "review": "Shifting to this ergonomic chair was a decision I wish I'd made earlier. Not only does it look sophisticated in its design, but the level of comfort is unparalleled. Long hours at the desk now feel less daunting, and my back is definitely grateful."},
    {"category": "Furniture", "product_name": "Ergonomic Chair", "review": "The meticulous craftsmanship of this chair is evident. Every component, from the armrests to the wheels, feels premium. The adjustability features mean I can tailor it to my needs, ensuring optimal posture and comfort throughout the day."},
    {"category": "Furniture", "product_name": "Ergonomic Chair", "review": "I was initially drawn to its aesthetic appeal, but the functional benefits have been profound. The breathable material ensures no discomfort even after prolonged use, and the robust build gives me confidence that it's a chair built to last."},
]



  • id (Integer, Primary Key)
  • category (String)
  • product_name (String)
  • review (String, Not Null)
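
As a rough illustration of the schema above, the same table can be written in plain SQL. This sketch uses Python's built-in sqlite3 module rather than the SQLAlchemy setup used in this article, and the column types (INTEGER, TEXT) are an assumption about how Integer and String map onto SQLite. It also shows the effect of the Not Null constraint on review.

```python
import sqlite3

# In-memory SQLite database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product_reviews (
        id INTEGER PRIMARY KEY,
        category TEXT,
        product_name TEXT,
        review TEXT NOT NULL
    )
    """
)

# A row with a review is accepted; id is assigned automatically.
conn.execute(
    "INSERT INTO product_reviews (category, product_name, review) VALUES (?, ?, ?)",
    ("Phone", "Iphone13", "Amazing battery life and camera quality. Best iPhone yet."),
)

# A row without a review violates the NOT NULL constraint.
try:
    conn.execute(
        "INSERT INTO product_reviews (category, product_name, review) VALUES (?, ?, ?)",
        ("TV", "SamsungTV", None),
    )
    null_review_rejected = False
except sqlite3.IntegrityError:
    null_review_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product_reviews").fetchone()[0]
print(row_count, null_review_rejected)
```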


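# Import paths assume a legacy (pre-0.10) llama_index release.
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, insert
from llama_index import SQLDatabase

engine = create_engine("sqlite:///:memory:")
metadata_obj = MetaData()

# Create the product reviews SQL table.
# Only id is a primary key, and product_name is a string, matching the schema above.
table_name = "product_reviews"
product_reviews_table = Table(
    table_name,
    metadata_obj,
    Column("id", Integer(), primary_key=True),
    Column("category", String(16)),
    Column("product_name", String(16)),
    Column("review", String, nullable=False),
)
metadata_obj.create_all(engine)

sql_database = SQLDatabase(engine, include_tables=["product_reviews"])

# Insert the sample reviews one row at a time.
for row in rows:
    stmt = insert(product_reviews_table).values(**row)
    with engine.connect() as connection:
        connection.execute(stmt)
        connection.commit()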




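  • Primary query: formulate the main question in natural language to extract preliminary data from the SQL table;
  • Secondary query: construct a follow-up question that refines or interprets the results of the primary query.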

2. Data retrieval: run the primary query with LlamaIndex's Text2SQL module to obtain the initial result set.
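
To make the retrieval step concrete, the sketch below shows the kind of SQL a Text2SQL module might produce for a primary query such as "Get the reviews of Iphone13", and runs it with the built-in sqlite3 module. The SQL string is a hand-written, assumed example, not output captured from LlamaIndex; the statement an LLM actually generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_reviews ("
    "id INTEGER PRIMARY KEY, category TEXT, product_name TEXT, review TEXT NOT NULL)"
)
sample_rows = [
    ("Phone", "Iphone13", "Amazing battery life and camera quality. Best iPhone yet."),
    ("TV", "SamsungTV", "Impressive picture clarity and vibrant colors. A top-notch TV."),
    ("Furniture", "Ergonomic Chair", "Feels really comfortable even after long hours."),
]
conn.executemany(
    "INSERT INTO product_reviews (category, product_name, review) VALUES (?, ?, ?)",
    sample_rows,
)

# Hand-written stand-in for the SQL that the Text2SQL step would generate.
assumed_sql = "SELECT review FROM product_reviews WHERE product_name = 'Iphone13'"
initial_results = conn.execute(assumed_sql).fetchall()
print(initial_results)
```

The result is a list of one-element tuples, one per matching review.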




Let's apply this to the query “Get the summary of reviews of Iphone13”. The system will generate:

Database query: “Retrieve reviews related to iPhone13 from the table.”

Interpretation query: “Summarize the retrieved reviews.”
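
The decomposition above relies on the LLM returning the two questions on separate lines. A minimal sketch of parsing such a reply follows; `parse_decomposition` is a hypothetical helper, not part of LlamaIndex, and it assumes the reply contains exactly two non-empty lines, optionally numbered.

```python
import re
from typing import Tuple


def parse_decomposition(llm_reply: str) -> Tuple[str, str]:
    """Split a two-line reply into (database query, interpretation query)."""
    questions = []
    for line in llm_reply.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between the two questions
        # Drop a leading "1." / "2)" style number if present.
        questions.append(re.sub(r"^\d+[.)]\s*", "", line))
    if len(questions) != 2:
        raise ValueError(f"expected 2 questions, got {len(questions)}")
    return questions[0], questions[1]


reply = (
    "1. Retrieve reviews related to iPhone13 from the table.\n"
    "\n"
    "2. Summarize the retrieved reviews.\n"
)
db_query, interpretation_query = parse_decomposition(reply)
print(db_query)
print(interpretation_query)
```

Raising on an unexpected number of lines makes a malformed LLM reply fail early instead of surfacing later as an unpacking error.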


from typing import List

from llama_index.llms import ChatMessage  # legacy (pre-0.10) import path

# `llm` is assumed to be a chat-capable LlamaIndex LLM created earlier.


def generate_questions(user_query: str) -> List[str]:
    # NOTE: the column list and example in this prompt describe a different table
    # than product_reviews (id, category, product_name, review); adapt them to your schema.
    system_message = '''
  You are given with Postgres table with the following columns.

  city_name, population, country, reviews.

  Your task is to decompose the given question into the following two questions.

  1. Question in natural language that needs to be asked to retrieve results from the table.
  2. Question that needs to be asked on the top of the result from the first question to provide the final answer.

  Example:

  Input:
  How is the culture of countries whose population is more than 5000000

  Output:
  1. Get the reviews of countries whose population is more than 5000000
  2. Provide the culture of countries
  '''

    messages = [
        ChatMessage(role="system", content=system_message),
        ChatMessage(role="user", content=user_query),
    ]
    reply = llm.chat(messages).message.content
    # Drop blank lines so that exactly the two questions remain.
    generated_questions = [q for q in reply.split('\n') if q.strip()]
    return generated_questions


user_query = "Get the summary of reviews of Iphone13"
text_to_sql_query, rag_query = generate_questions(user_query)









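from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine  # legacy (pre-0.10) import path

# `service_context` is assumed to have been configured earlier.
sql_query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["product_reviews"],
    synthesize_response=False,  # return the raw SQL result instead of a synthesized answer
    service_context=service_context,
)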



sql_response = sql_query_engine.query(text_to_sql_query)



import ast

# Parse the response string into a list of row tuples, then flatten it into one text.
sql_response_list = ast.literal_eval(sql_response.response)
text = [' '.join(t) for t in sql_response_list]
text = ' '.join(text)
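
The flattening step above can be tried in isolation. This sketch assumes that, with synthesize_response=False, the response string is the printed form of a list of row tuples; the sample string is made up for illustration.

```python
import ast

# Assumed shape of sql_response.response: the repr of a list of row tuples.
raw_response = (
    "[('Amazing battery life and camera quality. Best iPhone yet.',), "
    "('Feels really comfortable even after long hours.',)]"
)

# literal_eval parses Python literals only; it does not execute arbitrary code.
parsed_rows = ast.literal_eval(raw_response)

# Join the columns of each row, then join all rows into one block of text.
flat_text = " ".join(" ".join(row) for row in parsed_rows)
print(flat_text)
```

Note that `" ".join(row)` raises a TypeError if a row contains a non-string column such as id, so this approach works only when the generated SQL selects text columns.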





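from llama_index import Document, ListIndex  # legacy (pre-0.10) import path

# Wrap the retrieved text in a single document and index it.
listindex = ListIndex([Document(text=text)])
list_query_engine = listindex.as_query_engine()

# Answer the secondary question over the retrieved reviews.
response = list_query_engine.query(rag_query)
print(response.response)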


def sql_rag(user_query: str) -> str:
    """Function to perform SQL+RAG"""
    # Step 1: decompose the user query into a SQL question and a follow-up question.
    text_to_sql_query, rag_query = generate_questions(user_query)

    # Step 2: run Text2SQL and flatten the returned rows into one text.
    sql_response = sql_query_engine.query(text_to_sql_query)
    sql_response_list = ast.literal_eval(sql_response.response)
    text = [' '.join(t) for t in sql_response_list]
    text = ' '.join(text)

    # Step 3: answer the follow-up question over the retrieved text.
    listindex = ListIndex([Document(text=text)])
    list_query_engine = listindex.as_query_engine()
    summary = list_query_engine.query(rag_query)

    return summary.response


sql_rag("How is the sentiment of SamsungTV product?")

The sentiment of the reviews for the Samsung TV product is generally positive. Users express satisfaction with the picture clarity, vibrant colors, and stunning picture quality. They appreciate the smart features, user-friendly interface, and easy connectivity options. The sleek design and wall-mounting capability are also praised. The ambient mode, gaming mode, and HDR content are mentioned as standout features. Users find the remote control with voice command convenient and appreciate the regular software updates. However, some users mention that the sound quality could be better and suggest using an external audio system. Overall, the reviews indicate that the Samsung TV is considered a solid investment for quality viewing.

sql_rag("Are people happy with Ergonomic Chair?")

The overall satisfaction of people with the Ergonomic Chair is high.
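
The control flow of sql_rag can be exercised without an LLM or a database by passing the three stages in as plain functions. In this sketch, `sql_rag_pipeline` and the `fake_*` functions are hypothetical stand-ins written for illustration; they are not part of LlamaIndex and do not reproduce the real answers shown above.

```python
import ast
from typing import Callable, Tuple


def sql_rag_pipeline(
    user_query: str,
    decompose: Callable[[str], Tuple[str, str]],
    run_sql: Callable[[str], str],
    summarize: Callable[[str, str], str],
) -> str:
    """Decompose the query, fetch rows via SQL, then answer over the flattened text."""
    text_to_sql_query, rag_query = decompose(user_query)
    result_rows = ast.literal_eval(run_sql(text_to_sql_query))
    text = " ".join(" ".join(row) for row in result_rows)
    return summarize(text, rag_query)


# Stub stages standing in for the LLM, the Text2SQL engine, and the list index.
def fake_decompose(query: str) -> Tuple[str, str]:
    return ("Get the reviews of SamsungTV", "Provide the sentiment of the reviews")


def fake_run_sql(question: str) -> str:
    return "[('A top-notch TV.',), ('Crisp visuals.',)]"


def fake_summarize(text: str, question: str) -> str:
    return f"{question} -> {len(text.split())} words of context"


answer = sql_rag_pipeline(
    "How is the sentiment of SamsungTV product?",
    fake_decompose,
    fake_run_sql,
    fake_summarize,
)
print(answer)
```

Passing the stages as arguments keeps the orchestration testable and makes it easy to swap in the real LlamaIndex components later.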





