Getting Started with Mirascope: Removing Semantic Duplicates using an LLM

By NextTech, July 17, 2025 (5 min read)

Mirascope is a powerful and user-friendly library that provides a unified interface for working with a wide range of Large Language Model (LLM) providers, including OpenAI, Anthropic, Mistral, Google (Gemini and Vertex AI), Groq, Cohere, LiteLLM, Azure AI, and Amazon Bedrock. It simplifies everything from text generation and structured data extraction to building complex AI-powered workflows and agent systems.

In this guide, we'll focus on using Mirascope's OpenAI integration to identify and remove semantic duplicates (entries that may differ in wording but carry the same meaning) from a list of customer reviews.
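To see why semantic duplicates need more than string comparison, here is a minimal sketch using only the standard library. The helper name exact_dedupe and the four-item sample list are illustrative assumptions, not part of Mirascope. Exact matching removes the verbatim repeat but leaves three differently worded reviews that all describe the same battery complaint.

```python
# Three reviews that express the same complaint, plus one verbatim repeat.
reviews = [
    "Battery doesn't last as advertised.",
    "Needs charging too often.",
    "Battery drains quickly -- not ideal for travel.",
    "Needs charging too often.",
]


def exact_dedupe(items: list[str]) -> list[str]:
    """Drop verbatim repeats (ignoring case and outer whitespace), keeping order."""
    seen: set[str] = set()
    unique: list[str] = []
    for item in items:
        key = item.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


remaining = exact_dedupe(reviews)
print(len(remaining), "reviews remain after exact-match deduplication")
for review in remaining:
    print("-", review)
```

Only the repeated line is dropped. Merging the three remaining reviews requires a judgment about meaning, which is the part this guide delegates to the LLM.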

Installing the dependencies

pip install "mirascope[openai]"

OpenAI Key

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you're a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

import os
from getpass import getpass
os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key: ')
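If the key is sometimes already exported in the shell, a small wrapper can avoid prompting every time. The following is a sketch under that assumption; ensure_api_key and its ask parameter are hypothetical names introduced here, not part of Mirascope or the OpenAI SDK.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(
    name: str = "OPENAI_API_KEY",
    ask: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting only when it is missing."""
    key = os.environ.get(name, "").strip()
    if not key:
        # Nothing usable in the environment: ask once and store the answer.
        key = ask(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = key
    return key
```

Calling ensure_api_key() with no arguments falls back to the same getpass prompt used above. The ask parameter exists only so the prompt can be replaced in scripts or tests.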

Defining the list of customer reviews

customer_reviews = [
    "Sound quality is amazing!",
    "Audio is crystal clear and very immersive.",
    "Incredible sound, especially the bass response.",
    "Battery doesn't last as advertised.",
    "Needs charging too often.",
    "Battery drains quickly -- not ideal for travel.",
    "Setup was super easy and straightforward.",
    "Very user-friendly, even for my parents.",
    "Simple interface and smooth experience.",
    "Feels cheap and plasticky.",
    "Build quality could be better.",
    "Broke within the first week of use.",
    "People say they can't hear me during calls.",
    "Mic quality is terrible on Zoom meetings.",
    "Great product for the price!"
]

These reviews capture key customer sentiments: praise for sound quality and ease of use, complaints about battery life, build quality, and call/mic issues, along with a positive note on value for money. They reflect common themes found in real user feedback.

Defining a Pydantic Schema

This Pydantic model defines the structure of the response for a semantic deduplication task on customer reviews. It has two fields: the groups of semantically equivalent reviews, and the deduplicated list of core feedback themes. The schema helps structure and validate the output of a language model tasked with clustering or deduplicating natural language input (e.g., user feedback, bug reports, product reviews).

from pydantic import BaseModel, Field

class DeduplicatedReviews(BaseModel):
    # Each inner list holds reviews that express the same feedback
    duplicates: list[list[str]] = Field(
        ..., description="A list of semantically equivalent customer review groups"
    )
    # One entry per distinct theme found across all reviews
    reviews: list[str] = Field(
        ..., description="The deduplicated list of core customer feedback themes"
    )
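The validation this schema provides can be tried without calling any LLM. The sketch below repeats the model so that it runs on its own, then feeds it two hand-written payloads. Both payloads are illustrative assumptions, not real model output.

```python
from pydantic import BaseModel, Field, ValidationError


class DeduplicatedReviews(BaseModel):
    duplicates: list[list[str]] = Field(
        ..., description="A list of semantically equivalent customer review groups"
    )
    reviews: list[str] = Field(
        ..., description="The deduplicated list of core customer feedback themes"
    )


# Hand-written payload with the expected shape (illustrative, not LLM output).
well_formed = {
    "duplicates": [
        ["Needs charging too often.", "Battery drains quickly -- not ideal for travel."]
    ],
    "reviews": ["Battery life is shorter than expected."],
}
parsed = DeduplicatedReviews(**well_formed)
print(parsed.reviews)

# Payload with the wrong type for `duplicates`: validation should reject it.
malformed = {"duplicates": "not a list of lists", "reviews": ["ok"]}
try:
    DeduplicatedReviews(**malformed)
    rejected = False
except ValidationError:
    rejected = True
print("Malformed payload rejected:", rejected)
```

The schema checks shape and types only. Whether the reviews in a group really mean the same thing is still up to the model.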

Defining a Mirascope @openai.call for Semantic Deduplication

This code defines a semantic deduplication function using Mirascope's @openai.call decorator, which enables seamless integration with OpenAI's gpt-4o model. The deduplicate_customer_reviews function takes a list of customer reviews and uses a structured prompt, defined by the @prompt_template decorator, to guide the LLM in identifying and grouping semantically similar reviews.

The system message instructs the model to analyze the meaning, tone, and intent behind each review, clustering those that convey the same feedback even when worded differently. The function expects a structured response conforming to the DeduplicatedReviews Pydantic model, which includes two outputs: a list of unique, deduplicated review sentiments, and a list of grouped duplicates.

This design keeps the LLM's output consistently structured and machine-readable, making it well suited to customer feedback analysis, survey deduplication, or product review clustering.

from mirascope.core import openai, prompt_template

# response_model tells Mirascope to parse the reply into DeduplicatedReviews
@openai.call(model="gpt-4o", response_model=DeduplicatedReviews)
@prompt_template(
    """
    SYSTEM:
    You are an AI assistant helping to analyze customer reviews.
    Your task is to group semantically similar reviews together -- even if they are worded differently.

    - Use your understanding of meaning, tone, and implication to group duplicates.
    - Return two lists:
      1. A deduplicated list of the key distinct review sentiments.
      2. A list of grouped duplicates that share the same underlying feedback.

    USER:
    {reviews}
    """
)
def deduplicate_customer_reviews(reviews: list[str]): ...

The following code executes the deduplicate_customer_reviews function on the list of customer reviews and prints the structured output. First, it calls the function and stores the result in the response variable. To ensure that the model's output conforms to the expected format, it uses an assert statement to validate that the response is an instance of the DeduplicatedReviews Pydantic model.

Once validated, it prints the deduplicated results in two sections. The first section, labeled "✅ Distinct Customer Feedback," displays the list of unique review sentiments identified by the model. The second section, "🌀 Grouped Duplicates," lists clusters of reviews that were recognized as semantically equivalent.

response = deduplicate_customer_reviews(customer_reviews)

# Ensure response format
assert isinstance(response, DeduplicatedReviews)

# Print Output
print("✅ Distinct Customer Feedback:")
for item in response.reviews:
    print("-", item)

print("\n🌀 Grouped Duplicates:")
for group in response.duplicates:
    print("-", group)

The output shows a clean summary of customer feedback by grouping semantically similar reviews. The Distinct Customer Feedback section highlights key insights, while the Grouped Duplicates section captures different phrasings of the same sentiment. This helps eliminate redundancy and makes the feedback easier to analyze.
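Because the grouping comes from an LLM, it can be worth comparing the returned groups against the original input before relying on them. The following is a minimal sketch of such a check using only the standard library. The function check_groups and the sample groups are hypothetical and hand-written for illustration; they are not real model output.

```python
def check_groups(original: list[str], groups: list[list[str]]) -> dict[str, list[str]]:
    """Compare grouped duplicates against the input reviews.

    Returns three lists: reviews in the groups that were never in the input
    ("unknown"), input reviews that no group contains ("ungrouped"), and
    input reviews that appear more than once across the groups ("repeated").
    """
    known = set(original)
    seen: set[str] = set()
    unknown: list[str] = []
    repeated: list[str] = []
    for group in groups:
        for review in group:
            if review not in known:
                unknown.append(review)
            elif review in seen:
                repeated.append(review)
            else:
                seen.add(review)
    ungrouped = [review for review in original if review not in seen]
    return {"unknown": unknown, "ungrouped": ungrouped, "repeated": repeated}


# Hand-written sample data standing in for customer_reviews and response.duplicates.
sample_reviews = [
    "Sound quality is amazing!",
    "Audio is crystal clear and very immersive.",
    "Needs charging too often.",
    "Great product for the price!",
]
sample_groups = [
    ["Sound quality is amazing!", "Audio is crystal clear and very immersive."],
    ["Needs charging too often.", "Battery dies fast."],
]

report = check_groups(sample_reviews, sample_groups)
for label, items in report.items():
    print(label, "->", items)
```

An "unknown" entry suggests the model paraphrased or invented a review. An "ungrouped" entry is not necessarily a problem, since a review with no duplicates may reasonably be left out of every group.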


Check out the full code. All credit for this research goes to the researchers of this project.


I am a Civil Engineering Graduate (2022) from Jamia Millia Islamia, New Delhi, and I have a keen interest in Data Science, especially Neural Networks and their application in various areas.
