Design a Swiss Army Knife Research Agent with Tool-Using AI, Web Search, PDF Analysis, Vision, and Automated Reporting

By NextTech · February 21, 2026


In this tutorial, we build a "Swiss Army Knife" research agent that goes far beyond simple chat interactions and actively solves multi-step research problems end-to-end. We combine a tool-using agent architecture with live web search, local PDF ingestion, vision-based chart analysis, and automated report generation to demonstrate how modern agents can reason, verify, and produce structured outputs. By wiring together a smolagents-based agent, OpenAI models, and smart data-extraction utilities, we show how a single agent can find sources, cross-check claims, and synthesize findings into professional-grade Markdown and DOCX reports.

%pip -q install -U smolagents openai trafilatura duckduckgo-search pypdf pymupdf python-docx pillow tqdm


import os, re, json, base64, getpass
from typing import List, Dict, Any
import requests
import trafilatura
from duckduckgo_search import DDGS
from pypdf import PdfReader
import fitz  # PyMuPDF
from docx import Document
from docx.shared import Pt
from datetime import datetime


from openai import OpenAI
from smolagents import CodeAgent, OpenAIModel, tool


if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Paste your OpenAI API key (hidden): ").strip()
print("OPENAI_API_KEY set:", "YES" if os.environ.get("OPENAI_API_KEY") else "NO")


if not os.environ.get("SERPER_API_KEY"):
   serper = getpass.getpass("Non-obligatory: Paste SERPER_API_KEY for Google outcomes (press Enter to skip): ").strip()
   if serper:
       os.environ["SERPER_API_KEY"] = serper
print("SERPER_API_KEY set:", "YES" if os.environ.get("SERPER_API_KEY") else "NO")


client = OpenAI()


def _now():
   return datetime.utcnow().strftime("%Y-%m-%d %H:%M:%SZ")


def _safe_filename(s: str) -> str:
   s = re.sub(r"[^a-zA-Z0-9._-]+", "_", s).strip("_")
   return s[:180] if s else "file"

We set up the full execution environment and securely load all required credentials without hardcoding secrets. We import all dependencies required for web search, document parsing, vision analysis, and agent orchestration. We also initialize shared utilities to standardize timestamps and file naming throughout the workflow.
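As a quick sanity check, here is a minimal sketch (using only the helpers defined above, with an arbitrary example filename) showing what the two utilities produce:

# Quick sanity check for the shared utilities.
print(_now())  # UTC timestamp, e.g. "2026-02-21 09:15:02Z"

# Runs of disallowed characters collapse into single underscores:
print(_safe_filename("AI Agents: 2026 survey.pdf"))  # -> "AI_Agents_2026_survey.pdf"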

try:
    from google.colab import files
    os.makedirs("/content/pdfs", exist_ok=True)
    uploaded = files.upload()
    for name, data in uploaded.items():
        if name.lower().endswith(".pdf"):
            with open(f"/content/pdfs/{name}", "wb") as f:
                f.write(data)
    print("PDFs in /content/pdfs:", os.listdir("/content/pdfs"))
except Exception as e:
    print("Upload skipped:", str(e))


def web_search(query: str, k: int = 6) -> List[Dict[str, str]]:
    serper_key = os.environ.get("SERPER_API_KEY", "").strip()
    if serper_key:
        resp = requests.post(
            "https://google.serper.dev/search",
            headers={"X-API-KEY": serper_key, "Content-Type": "application/json"},
            json={"q": query, "num": k},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        out = []
        for item in (data.get("organic") or [])[:k]:
            out.append({
                "title": item.get("title", ""),
                "url": item.get("link", ""),
                "snippet": item.get("snippet", ""),
            })
        return out

    # Fallback: free DuckDuckGo search when no Serper key is set.
    out = []
    with DDGS() as ddgs:
        for r in ddgs.text(query, max_results=k):
            out.append({
                "title": r.get("title", ""),
                "url": r.get("href", ""),
                "snippet": r.get("body", ""),
            })
    return out


def fetch_url_text(url: str) -> Dict[str, Any]:
    try:
        # Timeouts are governed by trafilatura's own config; fetch_url takes no timeout kwarg.
        downloaded = trafilatura.fetch_url(url)
        if not downloaded:
            return {"url": url, "ok": False, "error": "fetch_failed", "text": ""}
        text = trafilatura.extract(downloaded, include_comments=False, include_tables=True)
        if not text:
            return {"url": url, "ok": False, "error": "extract_failed", "text": ""}
        title_guess = next((ln.strip() for ln in text.splitlines() if ln.strip()), "")[:120]
        return {"url": url, "ok": True, "title_guess": title_guess, "text": text}
    except Exception as e:
        return {"url": url, "ok": False, "error": str(e), "text": ""}

We enable local PDF ingestion and establish a flexible web search pipeline that works with or without a paid search API. We show how we gracefully handle optional inputs while maintaining a reliable research flow. We also implement robust URL fetching and text extraction to prepare clean source material for downstream reasoning.
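Before handing these functions to the agent, we can exercise the search-and-fetch pipeline directly. Here is a minimal standalone sketch (the query string is just an example):

# Smoke test: search, then fetch and extract the top result.
hits = web_search("tool-using AI agent design patterns", k=3)
for h in hits:
    print(h["title"], "->", h["url"])

if hits:
    page = fetch_url_text(hits[0]["url"])
    if page["ok"]:
        print(page["title_guess"])
        print(page["text"][:500])  # preview the cleaned article text
    else:
        print("Fetch failed:", page.get("error"))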

def read_pdf_text(pdf_path: str, max_pages: int = 30) -> Dict[str, Any]:
    reader = PdfReader(pdf_path)
    pages = min(len(reader.pages), max_pages)
    chunks = []
    for i in range(pages):
        try:
            chunks.append(reader.pages[i].extract_text() or "")
        except Exception:
            chunks.append("")
    return {"pdf_path": pdf_path, "pages_read": pages, "text": "\n\n".join(chunks).strip()}


def extract_pdf_images(pdf_path: str, out_dir: str = "/content/extracted_images", max_pages: int = 10) -> List[str]:
    os.makedirs(out_dir, exist_ok=True)
    doc = fitz.open(pdf_path)
    saved = []
    pages = min(len(doc), max_pages)
    base = _safe_filename(os.path.basename(pdf_path).rsplit(".", 1)[0])

    for p in range(pages):
        page = doc[p]
        img_list = page.get_images(full=True)
        for img_i, img in enumerate(img_list):
            xref = img[0]
            pix = fitz.Pixmap(doc, xref)
            if pix.n - pix.alpha >= 4:  # CMYK or similar: convert to RGB before saving
                pix = fitz.Pixmap(fitz.csRGB, pix)
            img_path = os.path.join(out_dir, f"{base}_p{p+1}_img{img_i+1}.png")
            pix.save(img_path)
            saved.append(img_path)

    doc.close()
    return saved


def vision_analyze_image(image_path: str, question: str, model: str = "gpt-4.1-mini") -> Dict[str, Any]:
    with open(image_path, "rb") as f:
        img_bytes = f.read()

    # The Responses API expects images as a data URL (or file ID), so base64-encode the bytes.
    b64 = base64.b64encode(img_bytes).decode("utf-8")

    resp = client.responses.create(
        model=model,
        input=[{
            "role": "user",
            "content": [
                {"type": "input_text", "text": f"Answer concisely and accurately.\n\nQuestion: {question}"},
                {"type": "input_image", "image_url": f"data:image/png;base64,{b64}"},
            ],
        }],
    )
    return {"image_path": image_path, "answer": resp.output_text}

We handle deep document understanding by extracting structured text and visual artifacts from PDFs. We integrate a vision-capable model to interpret charts and figures instead of treating them as opaque images. We ensure that numerical trends and visual insights can be converted into explicit, text-based evidence.
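Chained together, the three document utilities form a small standalone pipeline. A minimal sketch, assuming a hypothetical upload at /content/pdfs/sample.pdf:

pdf_path = "/content/pdfs/sample.pdf"  # hypothetical uploaded file
if os.path.exists(pdf_path):
    doc_info = read_pdf_text(pdf_path, max_pages=5)
    print(f"Read {doc_info['pages_read']} pages, {len(doc_info['text'])} characters")

    images = extract_pdf_images(pdf_path, max_pages=5)
    print("Extracted figures:", images)

    # Ask the vision model about the first extracted figure, if any.
    if images:
        result = vision_analyze_image(images[0], "What trend does this chart show?")
        print(result["answer"])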

def write_markdown(path: str, content: str) -> str:
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return path


def write_docx_from_markdown(docx_path: str, md: str, title: str = "Research Report") -> str:
    os.makedirs(os.path.dirname(docx_path), exist_ok=True)
    doc = Document()
    t = doc.add_paragraph()
    run = t.add_run(title)
    run.bold = True
    run.font.size = Pt(18)
    meta = doc.add_paragraph()
    meta.add_run(f"Generated: {_now()}").italic = True
    doc.add_paragraph("")
    for line in md.splitlines():
        line = line.rstrip()
        if not line:
            doc.add_paragraph("")
            continue
        if line.startswith("# "):
            doc.add_heading(line[2:].strip(), level=1)
        elif line.startswith("## "):
            doc.add_heading(line[3:].strip(), level=2)
        elif line.startswith("### "):
            doc.add_heading(line[4:].strip(), level=3)
        elif re.match(r"^\s*[-*]\s+", line):
            p = doc.add_paragraph(style="List Bullet")
            p.add_run(re.sub(r"^\s*[-*]\s+", "", line).strip())
        else:
            doc.add_paragraph(line)
    doc.save(docx_path)
    return docx_path


@tool
def t_web_search(query: str, k: int = 6) -> str:
    """Search the web and return results as a JSON string.

    Args:
        query: The search query.
        k: Maximum number of results to return.
    """
    return json.dumps(web_search(query, k), ensure_ascii=False)


@tool
def t_fetch_url_text(url: str) -> str:
    """Fetch a URL and return its extracted main text as a JSON string.

    Args:
        url: The URL to fetch.
    """
    return json.dumps(fetch_url_text(url), ensure_ascii=False)


@tool
def t_list_pdfs() -> str:
    """List the paths of uploaded PDFs as a JSON array."""
    pdf_dir = "/content/pdfs"
    if not os.path.isdir(pdf_dir):
        return json.dumps([])
    paths = [os.path.join(pdf_dir, f) for f in os.listdir(pdf_dir) if f.lower().endswith(".pdf")]
    return json.dumps(sorted(paths), ensure_ascii=False)


@tool
def t_read_pdf_text(pdf_path: str, max_pages: int = 30) -> str:
    """Extract text from a PDF and return it as a JSON string.

    Args:
        pdf_path: Path to the PDF file.
        max_pages: Maximum number of pages to read.
    """
    return json.dumps(read_pdf_text(pdf_path, max_pages=max_pages), ensure_ascii=False)


@tool
def t_extract_pdf_images(pdf_path: str, max_pages: int = 10) -> str:
    """Extract embedded images from a PDF and return their paths as a JSON array.

    Args:
        pdf_path: Path to the PDF file.
        max_pages: Maximum number of pages to scan.
    """
    imgs = extract_pdf_images(pdf_path, max_pages=max_pages)
    return json.dumps(imgs, ensure_ascii=False)


@tool
def t_vision_analyze_image(image_path: str, question: str) -> str:
    """Analyze an image (for example, a chart) with a vision model and return the answer as JSON.

    Args:
        image_path: Path to the image file.
        question: The question to ask about the image.
    """
    return json.dumps(vision_analyze_image(image_path, question), ensure_ascii=False)


@tool
def t_write_markdown(path: str, content: str) -> str:
    """Write Markdown content to a file and return the path.

    Args:
        path: Destination file path.
        content: The Markdown content to write.
    """
    return write_markdown(path, content)


@tool
def t_write_docx_from_markdown(docx_path: str, md_path: str, title: str = "Research Report") -> str:
    """Convert a Markdown file to a DOCX report and return the DOCX path.

    Args:
        docx_path: Destination DOCX path.
        md_path: Path to the source Markdown file.
        title: Title for the DOCX document.
    """
    with open(md_path, "r", encoding="utf-8") as f:
        md = f.read()
    return write_docx_from_markdown(docx_path, md, title=title)

We implement the full output layer by producing Markdown reports and converting them into polished DOCX documents. We expose all core capabilities as explicit tools that the agent can reason about and invoke step by step. We ensure that every transformation from raw data to final report remains deterministic and inspectable.
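The report writers can also be tested in isolation before the agent drives them. A minimal sketch with a throwaway Markdown snippet:

# Render a tiny Markdown report and convert it to DOCX.
sample_md = """# Findings
## Key points
- Tool-using agents benefit from explicit, narrow tools.
- Reports should cite every fetched source.
"""

md_path = write_markdown("/content/report/demo.md", sample_md)
docx_path = write_docx_from_markdown("/content/report/demo.docx", sample_md, title="Demo Report")
print("Wrote:", md_path, "and", docx_path)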

model = OpenAIModel(model_id="gpt-5")


agent = CodeAgent(
    tools=[
        t_web_search,
        t_fetch_url_text,
        t_list_pdfs,
        t_read_pdf_text,
        t_extract_pdf_images,
        t_vision_analyze_image,
        t_write_markdown,
        t_write_docx_from_markdown,
    ],
    model=model,
    add_base_tools=False,
    additional_authorized_imports=["json", "re", "os", "math", "datetime", "time", "textwrap"],
)


SYSTEM_INSTRUCTIONS = """
You're a Swiss Military Knife Analysis Agent.
"""


def run_research(topic: str):
    os.makedirs("/content/report", exist_ok=True)
    prompt = f"""{SYSTEM_INSTRUCTIONS.strip()}


Research question:
{topic}


Steps:
1) List available PDFs (if any) and decide which are relevant.
2) Do a web search for the topic.
3) Fetch and extract the text of the best sources.
4) If PDFs exist, extract text and images.
5) Visually analyze figures.
6) Write a Markdown report to /content/report/report.md and convert it to /content/report/report.docx.
"""
    return agent.run(prompt)


matter = "Construct a analysis transient on probably the most dependable design patterns for tool-using brokers (2024-2026), specializing in analysis, citations, and failure modes."
out = run_research(matter)
print(out[:1500] if isinstance(out, str) else out)


try:
    from google.colab import files
    files.download("/content/report/report.md")
    files.download("/content/report/report.docx")
except Exception as e:
    print("Download skipped:", str(e))

We assemble the complete research agent and define a structured execution plan for multi-step reasoning. We guide the agent to search, analyze, synthesize, and write using a single coherent prompt. We demonstrate how the agent produces a finished research artifact that can be reviewed, shared, and reused immediately.
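Once the run finishes, a quick standalone check (using the report paths assumed in the prompt above) confirms the artifacts were actually produced:

# Verify the generated artifacts on disk.
for path in ["/content/report/report.md", "/content/report/report.docx"]:
    if os.path.exists(path):
        print(f"{path}: {os.path.getsize(path)} bytes")
    else:
        print(f"{path}: missing - check the agent transcript above")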

In conclusion, we demonstrated how a well-designed tool-using agent can function as a reliable research assistant rather than a conversational toy. We showcased how explicit tools, disciplined prompting, and step-by-step execution allow the agent to search the web, analyze documents and visuals, and generate traceable, citation-aware reports. This approach offers a practical blueprint for building trustworthy research agents that emphasize evaluation, evidence, and failure awareness, capabilities increasingly essential for real-world AI systems.

