50+ Python Projects to Build in 2026 (With GitHub Links)

By BuildIdeas Team · June 2026 · 7 min read
Updated: June 2026

Beginner Projects

Personal Finance CLI Tracker

A command-line personal finance manager where users log transactions, set monthly budgets per category, and generate summary reports as rich terminal tables. Implements CSV/JSON persistence, colored output with Rich, and exportable monthly statements. Teaches Python file I/O, data modeling, type hints, and building ergonomic CLI tools with Typer.

Python 3.12 · Typer · Rich · Pandas · CSV/JSON
View on GitHub
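For the Personal Finance CLI Tracker, the per-category budget summary could be modelled roughly as below. This is a minimal sketch, not the linked repository's code: `Transaction` and `budget_report` are hypothetical names, amounts are assumed to be positive spending values, and `Decimal` is used so that currency sums do not pick up floating-point error.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    category: str
    amount: Decimal  # assumption: a positive number means money spent


def budget_report(
    transactions: list[Transaction], budgets: dict[str, Decimal]
) -> dict[str, dict[str, Decimal]]:
    """Return spent / budget / remaining for every known category."""
    spent: dict[str, Decimal] = defaultdict(Decimal)
    for tx in transactions:
        spent[tx.category] += tx.amount

    report = {}
    # Cover categories that have a budget but no spending yet, and vice versa.
    for category in sorted(set(spent) | set(budgets)):
        budget = budgets.get(category, Decimal("0"))
        report[category] = {
            "spent": spent[category],
            "budget": budget,
            "remaining": budget - spent[category],
        }
    return report
```

A negative `remaining` value marks an overspent category, which a Rich table could highlight in red.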

GitHub Repository Analyzer

A tool that queries the GitHub REST API to analyse any public user or organisation — fetching repositories, language breakdowns, star counts, commit frequency, and contributor stats — then renders everything as an interactive terminal dashboard. Introduces async HTTP, API pagination, rate-limit handling, and data aggregation patterns.

Python 3.12 · httpx · Rich · Typer · GitHub REST API
View on GitHub
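The API pagination mentioned for the GitHub Repository Analyzer is driven by the `Link` response header, which lists URLs tagged with `rel="next"`, `rel="last"`, and so on. The sketch below only parses that header; the helper names `parse_link_header` and `next_page_url` are hypothetical, and the HTTP calls themselves are left out.

```python
import re
from typing import Optional

# Matches one entry of a Link header: <url>; rel="name"
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def parse_link_header(header: str) -> dict[str, str]:
    """Map each rel value (next, prev, first, last) to its URL."""
    return {rel: url for url, rel in _LINK_RE.findall(header)}


def next_page_url(header: Optional[str]) -> Optional[str]:
    """Return the URL of the next page, or None when there is no next page."""
    if not header:
        return None
    return parse_link_header(header).get("next")
```

A fetch loop would request the first page, then keep following `next_page_url(response.headers.get("link"))` until it returns `None`.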

Automated PDF Report Generator

A script that ingests structured data (CSV or JSON), applies configurable templates, and generates professional multi-page PDF reports with charts, tables, headers, and footers. Useful for finance, HR, or sales reporting. Covers data transformation with Pandas, chart generation with Matplotlib, and PDF assembly with fpdf2.

Python 3.12 · fpdf2 · Pandas · Matplotlib · Jinja2
View on GitHub

Email Automation & Newsletter System

A bulk email platform where users compose Markdown newsletters, segment subscriber lists from CSV, personalise each email with Jinja2 templating, schedule sends, and track open rates via unique pixel links. Teaches SMTP handling, async sending with aiosmtplib, and building simple but reliable automation pipelines.

Python 3.12 · aiosmtplib · Jinja2 · Pandas · schedule · SQLite
View on GitHub

Desktop File Organiser & Duplicate Finder

A smart file management utility that watches a target directory, auto-sorts files into category folders (Images, Documents, Videos, Code), detects duplicates using perceptual hashing, and generates a cleanup report. Covers pathlib, watchdog, hashlib, and building safe file-system automation with confirmation prompts.

Python 3.12 · watchdog · pathlib · imagehash · Rich · Typer
View on GitHub
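For the File Organiser, exact duplicates can be found with `hashlib` alone. Note that this sketch does byte-for-byte matching with SHA-256, not the perceptual hashing the description mentions for images (that would need a library such as imagehash). The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups: list[list[Path]] = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash: dict[str, list[Path]] = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Grouping by size first avoids hashing most files, since only files of equal size can be identical.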

RSS Feed Aggregator & Digest Emailer

A scheduled feed reader that pulls articles from configurable RSS/Atom sources, deduplicates entries, applies keyword filters, stores them in SQLite, and emails a daily digest with clickable article cards rendered in HTML. Teaches XML/feed parsing, scheduling, persistent storage, and HTML email composition.

Python 3.12 · feedparser · schedule · aiosmtplib · Jinja2 · SQLite
View on GitHub
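The deduplication and keyword filtering steps of the RSS aggregator could lean on SQLite itself. The sketch below assumes each parsed entry is a dict with `link` and `title` keys and treats the link as the unique key; the table layout and the `store_new_entries` name are illustrative assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    link  TEXT PRIMARY KEY,   -- the entry URL doubles as the dedup key
    title TEXT NOT NULL
)
"""


def store_new_entries(conn, entries, keywords):
    """Insert unseen entries whose title matches a keyword; return them."""
    conn.execute(SCHEMA)
    wanted = [k.lower() for k in keywords]
    added = []
    for entry in entries:
        title = entry["title"]
        # An empty keyword list means "keep everything".
        if wanted and not any(k in title.lower() for k in wanted):
            continue
        cur = conn.execute(
            "INSERT OR IGNORE INTO articles (link, title) VALUES (?, ?)",
            (entry["link"], title),
        )
        if cur.rowcount == 1:  # 0 means the link was already stored
            added.append(entry)
    conn.commit()
    return added
```

The returned list is exactly what the daily digest needs: entries that are new since the last run.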

Markdown-Powered Static Site Generator

A lightweight static site generator that reads Markdown files from a content directory, applies Jinja2 HTML templates, generates tag/category index pages, builds a sitemap.xml, and outputs a deployable static site. A practical deep-dive into file processing, templating engines, and understanding how tools like Jekyll work under the hood.

Python 3.12 · Jinja2 · Markdown · PyYAML · pathlib · Typer
View on GitHub

Port Scanner & Network Recon Tool

An async port scanner that concurrently probes TCP/UDP ports on a given host, detects common service banners, maps open services, and outputs structured JSON or a Rich terminal table. A strong beginner DevSecOps project covering asyncio, socket programming, and network fundamentals.

Python 3.12 · asyncio · socket · Rich · Typer · JSON
View on GitHub
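The core of the async port scanner might look like the following sketch. It covers TCP connect scanning only (UDP probing and banner detection are left out), the names `is_port_open` and `scan` are hypothetical, and it should only be pointed at hosts you are allowed to scan.

```python
import asyncio


async def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        _, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False
    writer.close()
    try:
        await writer.wait_closed()
    except OSError:
        pass  # the peer may reset the connection while we close it
    return True


async def scan(host: str, ports: list[int], concurrency: int = 100) -> list[int]:
    """Probe ports concurrently, at most `concurrency` at a time."""
    limit = asyncio.Semaphore(concurrency)

    async def probe(port: int) -> tuple[int, bool]:
        async with limit:
            return port, await is_port_open(host, port)

    results = await asyncio.gather(*(probe(p) for p in ports))
    return sorted(port for port, is_open in results if is_open)
```

Calling `asyncio.run(scan("127.0.0.1", list(range(1, 1025))))` would return the open ports in that range. The semaphore keeps the number of simultaneous sockets bounded.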

Stock Price Alert System

A background service that polls real-time stock and crypto prices from a public API, compares them against user-configured thresholds, and fires alerts via email or SMS. Teaches polling loops, event-driven alerting, external API integration, and lightweight persistence with SQLite.

Python 3.12 · httpx · schedule · SQLite · smtplib · Typer · Rich
View on GitHub

Image Batch Processing Pipeline

A CLI tool that applies configurable transformations to folders of images — resizing, format conversion, watermarking, EXIF stripping, and WebP compression — using concurrent workers for fast bulk processing. Covers Pillow image manipulation, concurrent.futures, progress bars with tqdm, and building reusable processing pipelines.

Python 3.12 · Pillow · concurrent.futures · tqdm · Typer · Rich
View on GitHub

Web Scraping Pipeline with Data Export

An async web scraper that extracts structured product data (name, price, rating, availability) from e-commerce pages, deduplicates records, stores results in SQLite, and exports to CSV/JSON/Excel. Implements rotating user agents, retry logic, and politeness delays — practical production scraping patterns.

Python 3.12 · httpx · BeautifulSoup4 · asyncio · Pandas · SQLite
View on GitHub
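The retry logic named in the scraping pipeline is often implemented as exponential backoff with jitter. The sketch below is one such variant, written synchronously for brevity; `fetch_with_retry` is a hypothetical helper, `fetch` stands in for whatever function performs the request, and the delay values are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on exceptions with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Base delays of 0.5 s, 1 s, 2 s, ... plus up to 100% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. A production scraper would usually retry only on timeouts and 5xx or 429 responses rather than on every exception.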

Flashcard & Spaced Repetition Study App

A terminal-based flashcard application with spaced repetition scheduling (SM-2 algorithm), deck management, progress tracking, and markdown-rendered card content. Data persists in SQLite. Teaches algorithm implementation, date arithmetic, object-oriented design, and interactive terminal UIs with Rich.

Python 3.12 · Rich · SQLite · Typer · dataclasses
View on GitHub
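The SM-2 scheduling step at the heart of the flashcard app fits in a few lines. The sketch below follows one common reading of the published algorithm: a failed review (grade below 3) restarts the sequence without changing the ease factor, and the new interval is computed from the ease factor before it is updated. Implementations differ on both points, and `CardState` and `review` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0     # consecutive successful reviews
    interval_days: int = 0   # days until the next review
    ease: float = 2.5        # SM-2 "E-Factor", never below 1.3


def review(state: CardState, quality: int) -> CardState:
    """Return the next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease.
        return CardState(repetitions=0, interval_days=1, ease=state.ease)

    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * state.ease)

    # Ease grows for easy answers and shrinks for hard ones.
    ease = state.ease + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return CardState(state.repetitions + 1, interval, max(1.3, ease))
```

The next review date is then today's date plus `timedelta(days=state.interval_days)`, which is where the date arithmetic mentioned above comes in.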

YouTube Playlist Downloader & Organiser

A CLI tool that downloads YouTube playlists and channels, embeds metadata (thumbnail, title, description, chapters) into MP4/MP3 files, organises them into artist/album folder structures, and maintains a local database to avoid re-downloading. Covers subprocess management, async I/O, and building reliable media automation utilities.

Python 3.12 · yt-dlp · mutagen · SQLite · Typer · Rich
View on GitHub

Password Manager with Encryption

A local command-line password manager that stores credentials AES-256 encrypted in a local SQLite vault, supports categories and search, generates strong passwords, copies secrets to clipboard with auto-clear, and exports to an encrypted backup file. A security-focused project covering cryptography, key derivation, and building trustworthy secrets management.

Python 3.12 · cryptography · SQLite · pyperclip · Typer · Rich
View on GitHub
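The key derivation part of the password manager can be sketched with the standard library alone. The example below derives a 32-byte key (the size AES-256 expects) from a master password using PBKDF2-HMAC-SHA256 and generates random passwords with `secrets`. The iteration count is an assumed work factor, the function names are hypothetical, and the encryption step itself (handled by the `cryptography` package in the project's stack) is not shown.

```python
import hashlib
import secrets
import string

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def new_salt() -> bytes:
    """Random per-vault salt; store it next to the ciphertext, it is not secret."""
    return secrets.token_bytes(16)


def derive_key(
    master_password: str, salt: bytes, iterations: int = PBKDF2_ITERATIONS
) -> bytes:
    """Derive a 32-byte key from the master password and salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", master_password.encode("utf-8"), salt, iterations, dklen=32
    )


def generate_password(length: int = 20) -> str:
    """Random password drawn from letters, digits and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

The same password and salt always yield the same key, so the vault only needs to persist the salt and iteration count, never the key or the master password.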

GitHub Actions Workflow Generator

A CLI tool that takes a project description, detects the tech stack from the repository structure, and scaffolds production-ready GitHub Actions CI/CD YAML workflows including test, lint, build, and deploy steps. Teaches code generation, YAML manipulation, file system introspection, and building developer productivity tooling.

Python 3.12 · PyYAML · Jinja2 · Typer · Rich · pathlib
View on GitHub

Intermediate Projects

FastAPI Authentication & Authorization Service

A production-ready auth microservice with JWT access/refresh tokens, OAuth2 social login (Google, GitHub), role-based access control, email verification, password reset flows, and rate limiting. Implements async SQLAlchemy with PostgreSQL, Redis token blacklisting, and full test coverage with pytest. The definitive FastAPI auth reference project.

Python 3.12 · FastAPI · SQLAlchemy · PostgreSQL · Redis · JWT · OAuth2 · Docker
View on GitHub

Real-Time Chat API with WebSockets

A scalable real-time messaging backend supporting multiple chat rooms, user presence indicators, typing events, message history with pagination, and file attachment uploads to S3. Built with FastAPI WebSockets and Redis Pub/Sub for horizontal scaling across multiple instances. Covers async patterns, connection lifecycle management, and event broadcasting.

Python 3.12 · FastAPI · WebSockets · Redis · PostgreSQL · SQLAlchemy · AWS S3 · Docker
View on GitHub

Django E-Commerce Platform with Stripe

A full-featured e-commerce platform with product catalogue, faceted search, cart and wishlist, multi-step checkout, Stripe payment integration, order management, inventory tracking, and a seller dashboard. Implements Celery for order confirmation emails and Redis caching for product listings. A production-quality Django reference architecture.

Python 3.12 · Django · DRF · Stripe · Celery · Redis · PostgreSQL · Docker
View on GitHub

Async ETL Pipeline with Apache Airflow

A data engineering pipeline that extracts data from REST APIs and PostgreSQL, transforms and cleans it with Pandas, and loads it into a Snowflake data warehouse on a scheduled basis using Airflow DAGs. Implements incremental loading, SCD Type 2 history tracking, data quality checks, and alerting on pipeline failures.

Python 3.12 · Apache Airflow · Pandas · PostgreSQL · Snowflake · AWS S3 · Docker
View on GitHub

RAG-Powered Document Q&A Chatbot

A Retrieval-Augmented Generation chatbot that ingests PDF, DOCX, and TXT files, chunks and embeds them into a ChromaDB vector store, and answers natural-language questions with cited source passages. Features a FastAPI backend with streaming responses and a Streamlit frontend. Covers LangChain pipelines, vector search, and LLM prompt engineering.

Python 3.12 · LangChain · OpenAI · ChromaDB · FastAPI · Streamlit · Docker
View on GitHub

Machine Learning Model Serving API

A production-grade ML model serving platform where trained scikit-learn and XGBoost models are loaded at startup, exposed as typed FastAPI endpoints with Pydantic validation, versioned via a model registry, and monitored with Prometheus metrics and Grafana dashboards. Covers the full MLOps serving loop from training to inference in production.

Python 3.12 · FastAPI · scikit-learn · XGBoost · MLflow · Prometheus · Grafana · Docker
View on GitHub

Background Job Processing Platform

A distributed task queue platform built on FastAPI and Celery where users submit long-running jobs (image processing, report generation, ML inference) via REST endpoints, monitor job status in real time via WebSocket, and inspect task history in a Flower dashboard. Covers distributed worker patterns, result backends, and async job orchestration.

Python 3.12 · FastAPI · Celery · Redis · RabbitMQ · PostgreSQL · Flower · Docker
View on GitHub

Full-Stack Blog Platform with FastAPI & Next.js

A headless blog CMS with a FastAPI backend (JWT auth, CRUD posts, tags, comments, image uploads to S3) and a Next.js 14 SSR frontend. Features draft/published workflow, Markdown-to-HTML rendering, RSS feed generation, full-text search with PostgreSQL, and an admin dashboard. A canonical full-stack Python + TypeScript architecture showcase.

Python 3.12 · FastAPI · PostgreSQL · SQLAlchemy · Next.js · TypeScript · AWS S3 · Docker
View on GitHub

Real-Time Data Streaming Dashboard

A live analytics dashboard that ingests events from a Kafka topic (e.g., website clickstream data), processes them with a Python consumer pipeline applying windowed aggregations, persists results to PostgreSQL, and streams updates to a React frontend via Server-Sent Events. Covers event streaming, time-series aggregation, and building reactive data products.

Python 3.12 · Kafka · FastAPI · PostgreSQL · SSE · React · Docker
View on GitHub
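The windowed aggregations in the streaming dashboard can be illustrated without Kafka. The sketch below counts events per fixed, non-overlapping (tumbling) window; it assumes events arrive as `(unix_timestamp, key)` pairs and ignores late or out-of-order handling, which a real consumer would need to address. The function name is hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window_start, key) for fixed, non-overlapping windows.

    `events` is an iterable of (unix_timestamp, key) pairs, for example
    (1700000012, "/pricing") for one page view.
    """
    counts = Counter()
    for timestamp, key in events:
        # Floor the timestamp to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

Each `(window_start, key)` pair maps naturally onto one row in PostgreSQL, and a closed window is what gets pushed to the frontend over Server-Sent Events.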

LLM-Powered SQL Query Generator

A natural-language-to-SQL tool where users describe data questions in plain English and an LLM generates the corresponding SQL query, executes it against a connected PostgreSQL database, and renders results as formatted tables or charts. Implements schema introspection, query validation, and injection-safe execution. A strong AI engineering showcase.

Python 3.12 · LangChain · OpenAI · PostgreSQL · FastAPI · Streamlit · SQLAlchemy
View on GitHub

FastAPI Microservices with Docker Compose

A set of loosely coupled microservices (user service, product service, order service, notification service) communicating via REST and RabbitMQ, deployed together with Docker Compose, documented with OpenAPI, and observed via a shared Jaeger tracing setup. Teaches service decomposition, inter-service communication, and distributed tracing in Python.

Python 3.12 · FastAPI · PostgreSQL · RabbitMQ · Docker Compose · OpenTelemetry · Jaeger
View on GitHub

Scraping & Price Intelligence Platform

A production scraping system that monitors competitor product prices across multiple e-commerce sites, stores time-series price history in PostgreSQL, detects price drops above a threshold, fires webhook notifications, and visualises trends in a Streamlit dashboard. Covers Scrapy spiders, scheduler integration, deduplication, and building actionable intelligence pipelines.

Python 3.12 · Scrapy · PostgreSQL · Celery · Redis · Streamlit · Docker
View on GitHub

Computer Vision Object Detection API

A FastAPI service that accepts image uploads, runs YOLOv8 object detection, returns bounding boxes with class labels and confidence scores, and stores annotated images in S3. Includes a simple React frontend for drag-and-drop image analysis. Covers model loading, async inference, image annotation with OpenCV, and building production CV APIs.

Python 3.12 · FastAPI · YOLOv8 (Ultralytics) · OpenCV · AWS S3 · React · Docker
View on GitHub

Personal Knowledge Graph Builder

A tool that parses notes, documents, and web pages, extracts named entities and semantic relationships using spaCy and LLMs, builds a graph database (Neo4j or NetworkX), and lets users query and visualise their personal knowledge network. Covers NLP entity extraction, graph data modelling, and building knowledge management products.

Python 3.12 · spaCy · LangChain · OpenAI · Neo4j · FastAPI · NetworkX · Streamlit
View on GitHub

Automated Code Review Bot

A GitHub App that listens to pull request webhooks, sends the diff to an LLM with a custom review prompt, posts inline review comments on the PR, assigns severity labels, and tracks review history in PostgreSQL. A practical agentic AI project covering GitHub API webhooks, GitHub App authentication, LangChain, and async event processing.

Python 3.12 · FastAPI · LangChain · OpenAI · GitHub API · PostgreSQL · Docker
View on GitHub

Django Multi-Tenant SaaS Boilerplate

A production SaaS starter built on Django with organisation-level multi-tenancy using django-tenants (separate schemas per org), Stripe subscription billing, team member invitations, role-based permissions, Celery async tasks, and an admin analytics dashboard. A thorough Django SaaS architecture reference to walk through in job interviews.

Python 3.12 · Django · django-tenants · Stripe · Celery · PostgreSQL · Redis · Docker
View on GitHub

Sentiment Analysis API for Product Reviews

A FastAPI service that accepts product review text, applies a fine-tuned BERT sentiment classifier (positive/negative/neutral) with aspect-level analysis (delivery, quality, price), caches results in Redis, and exposes a Streamlit analytics dashboard showing aggregate sentiment trends. Covers Transformers fine-tuning, model serving, and NLP product building.

Python 3.12 · FastAPI · HuggingFace Transformers · PyTorch · Redis · Streamlit · Docker
View on GitHub

Data Quality & Validation Framework

A configurable data quality platform that runs validation rules (schema checks, null rates, distribution drift, referential integrity) against PostgreSQL tables or Pandas DataFrames on a schedule, stores quality metrics in a time-series database, and alerts data engineers to regressions via Slack or email. Covers Great Expectations, data contracts, and DataOps best practices.

Python 3.12 · Great Expectations · Prefect · PostgreSQL · Pandas · Slack API · Docker
View on GitHub

Async Web Crawler & Search Indexer

A high-performance async web crawler that crawls a domain, extracts and indexes page content, builds an inverted index in Redis, and exposes a FastAPI search endpoint supporting full-text and phrase queries with ranking. Covers aiohttp concurrency, robots.txt compliance, BFS/DFS crawl strategies, and building search infrastructure from scratch.

Python 3.12 · aiohttp · BeautifulSoup4 · Redis · FastAPI · asyncio · Docker
View on GitHub
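The inverted index behind the crawler's search endpoint can be prototyped in memory before moving it to Redis. The sketch below uses plain dictionaries instead of Redis, ranks by raw term counts rather than a tuned scoring function, and stores token positions so that phrase queries work. `InvertedIndex` and its methods are hypothetical names.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self) -> None:
        # term -> {doc_id -> positions of the term in that document}
        self.postings: dict[str, dict[str, list[int]]] = defaultdict(dict)

    def add(self, doc_id: str, text: str) -> None:
        for position, term in enumerate(tokenize(text)):
            self.postings[term].setdefault(doc_id, []).append(position)

    def search(self, query: str) -> list[str]:
        """Documents containing every query term, most term hits first."""
        terms = tokenize(query)
        if not terms or any(t not in self.postings for t in terms):
            return []
        docs = set.intersection(*(set(self.postings[t]) for t in terms))

        def score(doc: str) -> int:
            return sum(len(self.postings[t][doc]) for t in terms)

        return sorted(docs, key=lambda d: (-score(d), d))

    def phrase(self, query: str) -> list[str]:
        """Documents where the query terms appear consecutively."""
        terms = tokenize(query)
        hits = []
        for doc in self.search(query):
            starts = self.postings[terms[0]][doc]
            if any(
                all(s + i in self.postings[t][doc] for i, t in enumerate(terms))
                for s in starts
            ):
                hits.append(doc)
        return sorted(hits)
```

In a Redis-backed version, each term's postings would typically live under its own key, with the same intersection logic applied to the fetched sets.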

Predictive Churn Analytics Dashboard

A machine learning pipeline that ingests customer event data, engineers features with Pandas and Polars, trains an XGBoost churn prediction model, exposes SHAP explainability scores via a FastAPI endpoint, and visualises predictions and feature importance in an interactive Streamlit dashboard. Covers end-to-end MLOps from feature engineering to explainable production inference.

Python 3.12 · XGBoost · scikit-learn · SHAP · Polars · FastAPI · Streamlit · MLflow
View on GitHub

Advanced Projects

Multi-Agent AI Research Assistant

A CrewAI/LangGraph multi-agent system where specialised agents (researcher, writer, critic, fact-checker) collaborate to produce long-form research reports from a given topic. Each agent uses different tools (web search, vector retrieval, code execution) and the orchestrator manages agent-to-agent communication, task delegation, and output quality. A flagship agentic AI engineering project.

Python 3.12 · CrewAI · LangGraph · LangChain · OpenAI · Tavily Search API · FastAPI
View on GitHub

LLM-Powered Coding Assistant (Local)

A self-hosted coding assistant that runs a local LLM via Ollama, integrates with VS Code via a language server extension, provides autocomplete, docstring generation, test scaffolding, and code explanation, and logs all interactions to a local SQLite database for personalisation. Covers local LLM deployment, LSP protocol, streaming inference, and building developer tooling around AI.

Python 3.12 · Ollama · LangChain · FastAPI · SQLite · LSP · Docker
View on GitHub

Vector Search & Semantic Recommendation Engine

A production recommendation system that embeds product descriptions and user interaction history into vector representations, stores them in Pinecone or Weaviate, and serves personalised recommendations via a FastAPI endpoint with sub-10ms latency. Implements hybrid search (dense + sparse BM25), A/B testing infrastructure, and click-through rate logging.

Python 3.12 · OpenAI Embeddings · Pinecone · FastAPI · Redis · PostgreSQL · Docker
View on GitHub
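The dense half of the recommendation engine's search boils down to ranking items by vector similarity. The sketch below does a brute-force cosine similarity ranking in pure Python over made-up vectors; a vector database such as Pinecone replaces this loop with an approximate index, and the embeddings would come from a model rather than being typed by hand. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, items, k=3):
    """Return the k (item_id, score) pairs most similar to the query vector."""
    scored = [
        (item_id, cosine_similarity(query, vector))
        for item_id, vector in items.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

For example, `top_k([1.0, 0.0], {"mug": [1, 0], "cup": [0.9, 0.1], "sofa": [0, 1]}, k=2)` ranks `mug` first and `cup` second.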

Real-Time Fraud Detection System

A streaming fraud detection pipeline that consumes financial transaction events from Kafka, runs them through a trained LightGBM anomaly detector in under 5ms, writes results to PostgreSQL, and displays a live fraud alert dashboard in Streamlit with drilldown per transaction. Covers streaming ML inference, feature stores, model retraining pipelines, and production alert systems.

Python 3.12 · Kafka · LightGBM · FastAPI · PostgreSQL · Redis · Streamlit · MLflow
View on GitHub

Distributed Web Scraping Platform

A horizontally scalable scraping platform with a FastAPI control plane, Celery workers deployed on multiple nodes, a Playwright-based JavaScript-rendering engine, proxy rotation middleware, a structured data lake in S3, and a Streamlit monitoring dashboard. Supports 100K+ pages/day. Covers distributed systems, stateful crawl queues, and production scraping architecture.

Python 3.12 · FastAPI · Celery · Playwright · Redis · AWS S3 · PostgreSQL · Docker
View on GitHub

AI-Powered Resume Screening System

An HR automation platform where recruiters upload job descriptions and candidate CVs, an LLM scores match quality across configurable dimensions (skills, experience, culture), ranks candidates, generates structured evaluation reports, and surfaces key gaps. Implements LangChain structured output, async batch processing with Celery, and a Django admin interface for HR teams.

Python 3.12 · LangChain · OpenAI · Django · Celery · PostgreSQL · Pydantic · Docker
View on GitHub

Kubernetes-Deployed Microservices Platform

A cloud-native SaaS application decomposed into FastAPI microservices (auth, billing, notifications, core API), each containerised, deployed to a local Kubernetes cluster via Helm charts, with inter-service communication over gRPC, a centralised API gateway via Traefik, and end-to-end observability with Prometheus + Grafana + Jaeger tracing. The definitive Python cloud-native architecture reference.

Python 3.12 · FastAPI · gRPC · Kubernetes · Helm · Traefik · Prometheus · Grafana · Jaeger
View on GitHub

LLM Evaluation & Testing Framework

A testing platform for LLM-based applications that runs automated evaluation suites (factuality, faithfulness, toxicity, latency, cost) against multiple models and prompt versions, stores results in a PostgreSQL experiment tracker, visualises regressions in a Streamlit dashboard, and integrates with CI/CD pipelines via a pytest plugin. The foundation of responsible LLM production deployment.

Python 3.12 · LangChain · OpenAI · pytest · PostgreSQL · Streamlit · MLflow · Docker
View on GitHub

Real-Time Collaborative Code Editor Backend

A WebSocket-based backend for a collaborative code editor implementing CRDTs (Conflict-free Replicated Data Types) for concurrent text editing, operational transforms for cursor synchronisation, sandboxed code execution via Docker containers, and persistent session history in Redis and PostgreSQL. Covers the hardest problems in collaborative software: distributed consistency and safe code execution.

Python 3.12 · FastAPI · WebSockets · Redis · PostgreSQL · Docker SDK · asyncio
View on GitHub

ML Feature Store & Model Registry

A production MLOps platform with a centralised feature store (batch + online features via Redis), experiment tracking with MLflow, model versioning and staging promotion, A/B testing infrastructure for shadow deployments, and a FastAPI model serving layer with automatic canary rollout. Demonstrates full-cycle MLOps engineering from feature engineering to production observability.

Python 3.12 · MLflow · FastAPI · Redis · PostgreSQL · Feast · Prometheus · Grafana · Docker
View on GitHub

Autonomous AI Agent with Tool Use

A LangGraph-based autonomous agent that can browse the web, execute Python code in a sandbox, query databases, call external APIs, and maintain long-term memory across sessions using a vector store. Implements a reflection loop where the agent self-critiques and improves its outputs, with a FastAPI control plane for task submission and status tracking.

Python 3.12 · LangGraph · LangChain · OpenAI · ChromaDB · FastAPI · Docker · Playwright
View on GitHub

Data Lakehouse with dbt & Polars

A modern data lakehouse architecture that ingests raw data from multiple sources (APIs, databases, S3), stores it as Parquet files in a Delta Lake, applies dbt transformation models, orchestrates the pipeline with Prefect, and serves analytical queries via DuckDB with a Streamlit BI dashboard. Covers modern data stack architecture, columnar storage, and transformation-as-code.

Python 3.12 · Polars · dbt · DuckDB · Delta Lake · Prefect · AWS S3 · Streamlit · Docker
View on GitHub

GraphQL API with Strawberry & FastAPI

A production GraphQL API built with Strawberry Python that exposes a social network graph (users, posts, follows, feeds), implements DataLoader for N+1 query prevention, subscriptions for real-time feed updates via WebSocket, cursor-based pagination, and code-first schema definition using Python type hints, backed by automated tests. A definitive Python GraphQL architecture reference.

Python 3.12 · FastAPI · Strawberry GraphQL · SQLAlchemy · PostgreSQL · Redis · Docker
View on GitHub

AI Document Processing Pipeline

An intelligent document processing system that ingests invoices, contracts, and forms via a FastAPI upload endpoint, uses a vision LLM to extract structured data fields, validates against configurable JSON Schema rules, routes exceptions to a human-review queue, and stores clean structured data in PostgreSQL. Covers multimodal LLMs, structured output parsing, human-in-the-loop workflows, and document intelligence engineering.

Python 3.12 · FastAPI · OpenAI Vision · LangChain · Pydantic · PostgreSQL · Celery · Docker
View on GitHub

Distributed Load Testing Platform

A self-hosted load testing platform inspired by Locust where distributed worker nodes execute configurable test scenarios against target APIs, aggregate real-time metrics (RPS, latency percentiles, error rates) into a time-series database, detect performance regressions automatically, and render live dashboards in Grafana. A systems engineering project combining async networking, distributed coordination, and performance analysis.

Python 3.12 · Locust · FastAPI · Redis · PostgreSQL · Prometheus · Grafana · Docker · asyncio
View on GitHub
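The metrics the load testing platform aggregates (RPS, latency percentiles, error rates) can be computed from raw samples with the standard library. The sketch below summarises a single finished run; `latency_summary` is a hypothetical helper, it expects at least two latency samples, and a live system would compute these over sliding windows instead.

```python
import statistics


def latency_summary(latencies_ms, duration_seconds, errors=0):
    """Summarise one test run: throughput, error rate and latency percentiles."""
    total = len(latencies_ms) + errors
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "rps": total / duration_seconds,
        "error_rate": errors / total,
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }
```

Comparing `p95_ms` and `p99_ms` between two runs of the same scenario is a simple starting point for the automatic regression detection described above.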

Tips for Building Projects That Get You Hired

  1. Pick projects that show Python's strengths: automation, data processing, APIs, and AI integrations are what hiring managers look for in Python developers.

  2. Deploy every project: a Streamlit app or FastAPI endpoint on Railway or Render is far stronger than a local script on GitHub.

  3. Write clean, documented code: use docstrings, type hints, and a requirements.txt. Interviewers often open the repo live during the interview.

  4. Use real data: scrape it, pull it from a public API, or download it from Kaggle. Projects using real-world data always tell a better story.

  5. Add a README with a demo GIF or screenshot, setup instructions, and a one-line summary of what problem it solves and for whom.
