
From Minimal Chatbot to Full-Featured AI CLI 2

To err is human, to blame it on someone else is even more human, Jacob’s Law

Ollama is a lightweight, privacy-focused platform that lets you run large language models (LLMs) locally on your own machine, with no cloud dependency or costly monthly subscriptions required. It’s designed to make working with models like Llama 3, DeepSeek, Gemma, and others as simple as running a command in your terminal.

This is the third article in our three-part series about Complete Windows AI Dev Setup: WSL 2, Docker Desktop, Python & Ollama and is a continuation of the first two articles. If you haven’t already, please read parts 1 and 2 first, then come back here.

We’ll use Ollama’s LLM to refine your plain-English prompt into an optimized DuckDuckGo query, fetch the top five results, scrape their text with Trafilatura, and then summarize each page’s content.

Trafilatura

Trafilatura is a Python package and command-line tool designed to extract relevant text directly from the HTML of web pages. It simplifies the process of extracting structured, meaningful data from HTML sources through web crawling, scraping, and extraction techniques. It turns noisy markup into clean, structured text, so you can focus on analysis instead of boilerplate clutter.

Key Features

- Extracts the main text of a page while discarding boilerplate such as navigation menus, ads, and footers.
- Can also pull metadata (title, author, date) and comments alongside the main text.
- Outputs plain text, CSV, JSON, or XML, so results slot easily into downstream pipelines.
- Works both as a Python library and as a command-line tool, and ships with download and crawling utilities.
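
As a quick sanity check before we wire it into the pipeline, here is a minimal sketch of the extraction workflow (the URL is a placeholder; swap in any article page you want to test):

import trafilatura

# Fetch a page and extract its main text; placeholder URL for illustration
downloaded = trafilatura.fetch_url("https://example.com/some-article")
if downloaded:
    text = trafilatura.extract(downloaded)  # main content, boilerplate stripped
    print(text)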

Duckduckgo_search

Leverage DuckDuckGo’s HTML endpoint to retrieve titles, URLs, and snippets for your query, then feed each result into Trafilatura (and Ollama) for on-the-fly content analysis.
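
Standalone, the search call looks like this minimal sketch (the query string is just a sample):

from duckduckgo_search import DDGS

# Print title and URL for the top five hits of a sample query
with DDGS(timeout=20) as ddgs:
    for result in ddgs.text("bounded sets", max_results=5):
        print(result["title"], "->", result["href"])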

# vim queryweb.py
import ollama # Import the Ollama library for interacting with the AI model
import trafilatura # Import trafilatura for web crawling and content extraction
from trafilatura.settings import use_config # Import trafilatura settings for better HTML parsing
import mymessages # Import custom messages module for predefined message templates
from duckduckgo_search import DDGS # Import DDGS for performing DuckDuckGo searches
from colorama import Fore # Import Fore for colored terminal text output
from util import display_text_color # Import utility function for colored text display

def query_generator(model_name: str, messages: list[dict]) -> str:
    """
    Create a DuckDuckGo search query based on the most recent user message in the conversation history.

    Args:
        model_name: Ollama model name (e.g. 'deepseek-r1:8b')
        messages: History of chat messages (each a dict with 'role' & 'content')

    Returns:
        The text of a DuckDuckGo query.
    """
    # Validate: `messages` must be non-empty and its last entry must contain a 'content' field.
    if not messages or 'content' not in messages[-1]:
        raise ValueError("`messages` must be a non-empty list of dicts with a 'content' field")

    # It retrieves the content of the last user message, which will be used to generate a search query.
    user_prompt = messages[-1]['content']

    # An instruction string is constructed that tells the model to create a DuckDuckGo query based on the user's prompt.
    instruction = (
        "CREATE A DUCKDUCKGO QUERY FOR THIS PROMPT:\n" # Instruction for the model
        f"{user_prompt}" # Include the user's prompt in the instruction
    )

    # Send the system + user instruction to the model:
    try:
        resp = ollama.chat(
            model=model_name, # Specify the model to use
            messages=[
                mymessages.query_msg, # Include a predefined message from mymessages module
                {"role": "user", "content": instruction} # Send the previous constructed instruction
            ]
        )
    except Exception as e:
        # Graceful error reporting
        raise RuntimeError(f"Ollama chat failed: {e}") # Raise an error with a message if the chat fails

    # Return the content of the response from the model, which is expected to be the generated DuckDuckGo query.
    return resp.message.content


def ai_web_search(query, model_name="deepseek-r1:8b"):
    """
    Generate a refined search query from the user's input and the conversation history, using the specified AI model to improve the quality of the search terms.

    Args:
        query (str): The search query to use
        model_name (str): Name of the model to use for generating the query

    Returns:
        str: The refined search query, ready for use in further web searches.
    """
    # Initialize conversation with system and user prompts
    messages = [
        mymessages.assistant_msg, # System prompt from mymessages module
        mymessages.myuser_msg, # User prompt from mymessages module
    ]

    # Append the user's query to the conversation
    messages.append({"role": "user", "content": query})

    # It calls query_generator to generate a refined search query based on the conversation history. This utilizes the specified AI model.
    search_query = query_generator(model_name, messages)

    # Keep only the last line of the model's reply (the "QUERY: ..." line)
    last_sentence = search_query.splitlines()[-1].strip()

    # Strip the "QUERY:" prefix requested by the query_msg template, if present
    if last_sentence.upper().startswith("QUERY:"):
        last_sentence = last_sentence[len("QUERY:"):].strip()

    # Remove surrounding quotes if the model wrapped the query in them
    if last_sentence.startswith('"') and last_sentence.endswith('"'):
        last_sentence = last_sentence[1:-1]

    return last_sentence # Return the final processed search query

def duckduckgo_search(query: str, max_results: int = 5):
    """
    Perform a DuckDuckGo search, display up to `max_results` hits, and scrape each one.

    Args:
        query (str): The search query to use
        max_results (int): Maximum number of results to retrieve

    Returns:
        None: Each result is printed and its URL is passed to scrape_web_content.
    """

    print(f"🔎  DuckDuckGo: {query}") # Print the search query being executed
    ddgs = DDGS(timeout=20) # Initialize DuckDuckGo Search with a timeout of 20 seconds

    # Iterate over search results, retrieving a maximum of `max_results`
    for idx, result in enumerate(ddgs.text(query, max_results=max_results), start=1):
        # Display the title and URL of each result in a formatted manner
        display_text_color(f"{idx}. {result['title']}\n   {result['href']}", Fore.MAGENTA)

        # Call a function to scrape web content from the result URL
        scrape_web_content(result["href"]) # Each URL is passed to a web scraping function to extract additional content.

    print("— end of results —") # Indicate the end of search results

def my_duckduckgo_search(query: str, model_name="qwen3:8b"):
    """
    Perform a DuckDuckGo search and print the first five results.

    Args:
        query (str): The search query to use
        model_name (str): Name of the model to use for generating the query

    Returns:
        None: It does not return anything, but prints the search results and scrapes the content of each result.
    """
    # Generate an improved search query using the ai_web_search function
    improved_query = ai_web_search(query, model_name)

    # Perform the DuckDuckGo search with the improved query
    duckduckgo_search(improved_query)

def scrape_web_content(url: str = "", model_name: str = "deepseek-r1:8b"):
    """
    Scrape and summarize web content using Ollama.

    Args:
        url (str): The URL of the web page to scrape

    Returns:
        str | None: An error message on failure; on success the summary is printed and None is returned.
    """
    print(f"Scraping content from: {url}") # Debug print statement

    # Validate URL
    if not url.startswith(("http://", "https://")):
        return "Invalid URL format"

    try:
        # Configure trafilatura (better HTML parsing)
        config = use_config()
        config.set("DEFAULT", "EXTRACTION_TIMEOUT", "0")

        # Fetch and extract content
        downloaded = trafilatura.fetch_url(url)
        if not downloaded:
            return f"No content available at {url}"

        content = trafilatura.extract(
            downloaded,
            include_formatting=False,
            include_links=True,
            config=config
        )
        if not content:
            return f"Failed to extract content from {url}"

        # Summarize the downloaded content
        messages = [
                mymessages.query_summarize, # System prompt from messages module
                {"role": "user", "content": content} # User prompt with scraped content
        ]

        # Call Ollama API
        try:
            response = ollama.chat(model=model_name, messages=messages)
            display_text_color(f"Response from model: {response.message.content}", Fore.GREEN)
        except Exception as e:
            # Graceful error reporting
            raise RuntimeError(f"Ollama chat failed: {e}")
    except Exception as e:
        return f"Error fetching content from {url}: {e}"

if __name__ == "__main__":
    my_duckduckgo_search("bounded sets", "qwen3:8b")  # Example usage of my_duckduckgo_search
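
One note: display_text_color comes from the util module built in the earlier parts of this series. If you are following along without it, a minimal stand-in (an assumption, not the original helper) could be:

# util.py: hypothetical minimal stand-in for the series' display helper
from colorama import Style

def display_text_color(text: str, color: str) -> None:
    # Print `text` in the given colorama color, then reset terminal styling
    print(f"{color}{text}{Style.RESET_ALL}")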

Prompt Templates (mymessages.py)

We have separated our “system” and “user” instructions into four JSON-style message dictionaries. At runtime, we will prepend these to every Ollama call to steer the model consistently:

# assistant_msg is the system prompt, telling the model how to behave overall.
# It enforces accuracy, humility (“say ‘I don’t know’”), and avoids needless back-and-forth clarifications.
assistant_msg = {
    'role': 'system',
    'content': (
        'You are a helpful assistant designed to provide information and answer questions. '
        'You should always strive to give accurate, comprehensive, and helpful responses. '
        'If you do not know the answer, it is better to say "I do not know" than to provide incorrect information. '
        'You should also avoid making assumptions about the user\'s intent or knowledge level. '
        'Do not ask for clarification if the user\'s question is ambiguous or unclear.'
    )
}

# It makes it clear to the assistant that the user is constructive and intentional — setting the stage for clear dialogue and better results.
myuser_msg = {
    'role': 'user',
    'content': (
        'You are a user who can ask questions and provide input to the assistant. '
        'You should ask clear and specific questions to get the best responses.'
    )
}

# query_msg is used when we run ollama.chat(...) to generate DuckDuckGo queries.
# It ensures the model outputs exactly one line in the prescribed "QUERY: …" format, with no extra commentary.
query_msg = {
    'role': 'system',
    'content': (
        'You are an AI model that generates English web search queries from user input. '
        'Your goal is to provide a single query that will likely yield the most relevant and useful results. '
        'You must ensure that the query is clear, concise, and specific to the topic at hand. '
        'After thinking, the last line of your reply must have the format: "QUERY: ..." '
        'Do not include any additional ideas, text, or explanations after it.'
    )
}

# It instructs the assistant to generate well-structured, accurate, and concise summaries of web-scraped content.
query_summarize = {
    'role': 'system',
    'content': (
        'Provide a concise and comprehensive summary of the given text. '
        'The summary should capture the main points and key details while conveying the author\'s intended meaning accurately. '
        'Ensure that the summary is well-organized and easy to read, with clear headings and subheadings to guide the reader through each section. '
        'Keep the length appropriate: enough to cover the main points and key details, without unnecessary information or becoming overly long. '
        'To ensure accuracy, read the text carefully and pay attention to any nuances or complexities in the language.'
    )
}
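
For instance, query_generator prepends query_msg to steer the output format. Stripped down to its essentials, that call looks like the sketch below (the prompt text is just a sample):

import ollama
import mymessages

# Prepend the query_msg template so the model ends its reply
# with a single "QUERY: ..." line; the user prompt is a sample.
resp = ollama.chat(
    model="deepseek-r1:8b",
    messages=[
        mymessages.query_msg,
        {"role": "user", "content": "CREATE A DUCKDUCKGO QUERY FOR THIS PROMPT:\nbounded sets"},
    ],
)
print(resp.message.content)  # the last line should read: QUERY: ...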

By cleanly separating prompt templates from runtime configuration, our code remains modular, testable, and easy to extend.

Environment Variables (.env)

The .env file lets you adjust runtime parameters without changing code. We load these via python-dotenv early in main():

# Local Hugo Server
BASE_URL="http://192.168.1.36:1313"
# Global timeout in seconds for all HTTP requests.
# Prevents our CLI from hanging indefinitely if a site is unresponsive.
REQUEST_TIMEOUT=15
# Name of the Ollama model we will chat with by default
MODEL="deepseek-r1:8b"
# Boolean flag (True/False) to enable or disable web crawling
CRAWL=True
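
A sketch of how main() might read these values with python-dotenv (the fallback defaults here are illustrative assumptions):

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

# Fallback defaults below are illustrative, not from the original code
BASE_URL = os.getenv("BASE_URL", "http://localhost:1313")
REQUEST_TIMEOUT = int(os.getenv("REQUEST_TIMEOUT", "15"))
MODEL = os.getenv("MODEL", "deepseek-r1:8b")
CRAWL = os.getenv("CRAWL", "False").lower() in ("true", "1", "yes")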