System Architecture

The Aura project represents a shift toward modular open-source hardware. This documentation provides a technical foundation for building, deploying, and maintaining Aura systems.

What Is Aura

Aura is an open-source AI wearable pendant. You wear it around your neck. It listens, sees, and thinks — so you don't have to pause your day to take notes or remember things.

It captures audio and images, sends them to an AI, and turns everything into summaries, transcripts, and action items — all accessible from your phone.

Built on the Omi ecosystem. Fully open source. Around $50–70 to build yourself.

How Aura Works

The pipeline runs in seven steps:

  • Aura's microphone captures audio continuously.
  • The camera captures images at set intervals.
  • Both are sent to the backend over Wi-Fi.
  • The backend transcribes audio via Deepgram or Whisper.
  • Images are understood via GPT-4o Vision or Moondream.
  • Everything is summarized and stored.
  • You see it all in the Omi app on your phone.

The ESP32-S3 handles capture and transmission. The backend handles all AI processing. Your phone is just the mobile interface.

Bill of Materials

What to buy and where to get it:

Item                 Where to Buy                    Approx. Cost
XIAO ESP32-S3 Sense  Seeed Studio / Amazon           $15–24
150mAh LiPo × 6      Amazon                          $12
Wires                Amazon                          $5
3D printed case      Print yourself or order online  $5–10
USB-C cable          Anywhere
Total cost                                           ~$50–70

You can print the case STL files from the Aura hardware folder on GitHub.

Case Assembly

Download the STL files from the Aura Hardware folder on GitHub.

Recommended 3D Print Settings:

  • Material: PLA or PETG
  • Layer height: 0.2mm
  • Infill: 20%
  • Supports: Yes, where needed

After printing, mount the ESP32-S3 board, route the battery wires, and snap the case shut. The pendant loop is built into the design. If you don't have a 3D printer, you can use local makerspaces or online printing services like Craftcloud or PCBWay.

Hardware Overview

The XIAO ESP32-S3 Sense is the core. Camera and microphone are already integrated on the board — no extra modules needed.

Component        Details
Microcontroller  XIAO ESP32-S3 Sense
Camera           OV2640 (built into the ESP32-S3 Sense)
Microphone       PDM (built into the ESP32-S3 Sense)
Battery          6× 150mAh LiPo cells
Enclosure        Custom 3D printed case
Connectivity     Wi-Fi 2.4GHz + Bluetooth LE
Charging         USB-C

Flashing the Firmware

Aura firmware runs on the XIAO ESP32-S3 Sense. It coordinates audio streams, BLE advertisement, Wi-Fi connections, and camera frames. You can install it using several methods depending on your development environment.

Method 1: UF2 Drag-and-Drop Flash (Easiest)

This method requires no build tools or compiler setup.

  1. Connect your XIAO ESP32-S3 Sense to your computer using a USB-C data cable.
  2. Enter bootloader mode: Hold down the Boot button on the board, press and release the Reset button, then release the Boot button.
  3. A new USB disk drive named "ESP32S3" will mount on your system.
  4. Copy the pre-compiled .uf2 file from the firmware/releases/ folder in the repo and drop it directly onto the mounted drive. The device will auto-flash and reboot.

Method 2: PlatformIO (Recommended for developers)

PlatformIO gives you full control over build environments. From the repo root, run:

pip install platformio
cd firmware

# Compile and upload via standard environment
platformio run -e seeed_xiao_esp32s3 --target upload

# Open serial debug output
platformio device monitor --baud 115200

Method 3: Arduino IDE

If flashing via the Arduino IDE, follow these settings carefully — compilation may succeed but the camera will fail at runtime if PSRAM is not configured correctly:

  1. Install Arduino IDE 2.x.
  2. Add the ESP32 board manager URL in File → Preferences → Additional boards manager URLs: https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json
  3. Go to Tools → Board → Boards Manager → search esp32 → Install version 2.0.17.
  4. Configure the board profile in the Tools menu — get these right or the camera won't work:
    Tools → Board  →  XIAO_ESP32S3
    Tools → PSRAM  →  OPI PSRAM        ← Required. Camera fails without this.
    Tools → Port   →  your COM port

    On Windows, if the COM port doesn't appear, install the CH340 driver and restart.

  5. Open firmware/firmware.ino and click Upload.
  6. Open Serial Monitor at 115200 baud. Expected output:
    [AURA] Camera initialized
    [AURA] Microphone initialized
    [AURA] BLE advertising
    [AURA] Ready

Method 4: Arduino CLI

For headless or CI environments, use the Arduino CLI to compile and upload directly from the terminal:

# Add ESP32 board manager
arduino-cli config add board_manager.additional_urls \
  https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json

arduino-cli core install esp32:esp32@2.0.17

# Check board ID
arduino-cli board list
# Windows 11 should show: esp32:esp32:XIAO_ESP32S3

# Compile and upload (replace COM5 with your port)
arduino-cli compile --build-path build --output-dir dist \
  -e -u -p COM5 -b esp32:esp32:XIAO_ESP32S3:PSRAM=opi

The Aura firmware depends on two external libraries for Opus audio codec support. Clone them into your Arduino libraries folder before compiling:

# Find your libraries folder
arduino-cli config get directories.user
# then add /libraries to that path

git clone https://github.com/pschatzmann/arduino-libopus.git
git clone https://github.com/pschatzmann/arduino-audio-tools.git

PlatformIO Build Environments

Environment              Description              Use Case
seeed_xiao_esp32s3       Standard build           Development
seeed_xiao_esp32s3_slow  Slower upload speed      Connection issues
uf2_release              Optimized release build  Production / best battery

Setting Up the Backend

The Aura backend handles all AI processing, transcription, conversation processing, and cloud integrations. It is the orchestration layer between the ESP32-S3 hardware and the model APIs.

System Requirements & Dependencies:

  • Python 3.10 or 3.11 (tested primarily on Python 3.10.12 / 3.11.4)
  • Google Cloud SDK & Firebase Command Line Tools
  • System packages: git, ffmpeg, and opus
  • Ngrok or Cloudflare Tunnel (to expose the local server during development)

Step 1: Google Cloud & Firebase Initialization

Aura uses Firebase Firestore for storing conversation keys and settings. Authenticate your terminal and set the active project:

gcloud auth login
gcloud config set project <your-project-id>
gcloud auth application-default login --project <your-project-id>

Ensure the following APIs are enabled in the Google Cloud API Console: Cloud Resource Manager API, Firebase Management API, and Cloud Firestore API. Then, create the necessary composite indexes in Firestore:

Collection ID  Fields (Index Definition)
dev_api_keys   user_id (Ascending) + created_at (Descending)
mcp_api_keys   user_id (Ascending) + created_at (Descending)
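
The same two indexes can also be written in the firestore.indexes.json layout that the Firebase CLI deploys with firebase deploy --only firestore:indexes. The sketch below generates that structure. The COLLECTION query scope is an assumption, since the table above does not state it, and build_index_config is a hypothetical helper name.

```python
import json

# Composite indexes from the table above: (collection, [(field, order), ...]).
REQUIRED_INDEXES = [
    ("dev_api_keys", [("user_id", "ASCENDING"), ("created_at", "DESCENDING")]),
    ("mcp_api_keys", [("user_id", "ASCENDING"), ("created_at", "DESCENDING")]),
]


def build_index_config(indexes, query_scope="COLLECTION"):
    """Return a dict in the firestore.indexes.json layout."""
    return {
        "indexes": [
            {
                "collectionGroup": collection,
                "queryScope": query_scope,  # assumption: not stated in the docs
                "fields": [
                    {"fieldPath": field, "order": order} for field, order in fields
                ],
            }
            for collection, fields in indexes
        ],
        "fieldOverrides": [],
    }


config = build_index_config(REQUIRED_INDEXES)
print(json.dumps(config, indent=2))
```

Saving the printed JSON as firestore.indexes.json keeps the index definitions in version control instead of relying on manual console setup.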

Step 2: Backend Environment Variables Setup

Clone the codebase, enter the backend folder, and create your .env file based on the template:

git clone https://github.com/thesohamdatta/Aura-Wearable-AI.git
cd Aura-Wearable-AI/backend
cp .env.template .env

Configure the parameters inside your .env. You must provide:

  • OPENAI_API_KEY & DEEPGRAM_API_KEY
  • PINECONE_API_KEY & PINECONE_INDEX_NAME (Index set to 1536 dimensions for OpenAI vectors)
  • REDIS_DB_HOST, REDIS_DB_PORT, and REDIS_DB_PASSWORD (for the memory cache)
  • GOOGLE_APPLICATION_CREDENTIALS=google-credentials.json
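
A quick pre-flight check can catch a missing value before the server starts. This is a minimal sketch, not part of the Aura backend: missing_vars is a hypothetical helper, and it assumes the .env values are already loaded into the process environment (for example through uvicorn's --env-file option).

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "REDIS_DB_HOST",
    "REDIS_DB_PORT",
    "REDIS_DB_PASSWORD",
    "GOOGLE_APPLICATION_CREDENTIALS",
]


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vars()
if missing:
    print("Missing .env values:", ", ".join(missing))
else:
    print("All required variables are set.")
```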

Step 3: Installation & Server Boot

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

# Install requirements
pip install -r requirements.txt

# Start FastAPI server
uvicorn main:app --reload --env-file .env --port 8000

Step 4: Exposing local server to device (Ngrok)

Start a secure tunnel on port 8000 to acquire a public HTTPS URL:

ngrok http 8000

Copy the generated HTTPS tunnel address and paste it as the API_BASE_URL in your app's environment configuration, making sure to append the trailing slash: API_BASE_URL=https://<subdomain>.ngrok-free.app/.
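
Because a missing trailing slash is an easy mistake, the URL can be normalized before it is pasted into the configuration. A minimal sketch follows; normalize_base_url is a hypothetical helper and the example hostname is a placeholder.

```python
from urllib.parse import urlsplit


def normalize_base_url(url: str) -> str:
    """Check that the URL is HTTPS and make it end with exactly one slash."""
    url = url.strip()
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an https:// URL, got {url!r}")
    return url.rstrip("/") + "/"


# Placeholder hostname; substitute the address ngrok prints.
print(normalize_base_url("https://example.ngrok-free.app"))
```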

Setting Up the App

Aura ships a native Android companion app built with Kotlin. It pairs with the pendant over Bluetooth LE, lets you configure the backend URL, browse transcription history, and monitor battery status in real time.

Requirements:

  • Android Studio Hedgehog or later
  • Android SDK 26+ (Android 8.0 minimum)
  • A physical Android device with Bluetooth LE support — BLE emulation is unreliable on emulators

Option A — Android Studio (Recommended):

  1. Clone the repo and open app/android/ in Android Studio.
  2. Wait for the Gradle sync to complete.
  3. Connect your Android device via USB and enable USB Debugging in Developer Options.
  4. Click Run ▶.

Option B — Command Line:

cd app/android

# Debug build
./gradlew assembleDebug

# Install directly on connected device
./gradlew installDebug

APK output: app/build/outputs/apk/debug/app-debug.apk

Configuration:

After first launch, open the app settings and set your backend URL:

API_BASE_URL = https://your-ngrok-url.ngrok-free.app/

Include the trailing slash.

Pairing with Aura:

  1. Flash the firmware to the pendant (see Flashing the Firmware above).
  2. Power on the Aura pendant.
  3. Open the app → go to Devices.
  4. Tap Pair new device → select AURA from the BLE scan list.
  5. Connection established — the pendant LED confirms the pairing.

App Troubleshooting

Issue                       Fix
Device not found in scan    Power cycle pendant, ensure BLE is enabled on phone
Gradle sync fails           Check Android SDK version, update Gradle plugin
APK installs but crashes    Verify API_BASE_URL is set correctly with trailing slash
No transcription appearing  Check backend is running and ngrok tunnel is active

AI Providers Setup

Aura supports three distinct AI providers. Set your preferred choice in your environment (.env) file:

Provider  Best For                      Internet Required
Groq      Fast, real-time responses     Yes
OpenAI    Best quality, vision support  Yes
Ollama    Local, private, offline       No (Localhost)

For Ollama (local private setup), set OLLAMA_URL=http://localhost:11434 in your backend .env.

Transcription Setup

Aura captures 16kHz mono audio from the built-in PDM microphone on the ESP32-S3 Sense. Two transcription options are available:

Option           Speed             Internet Required
Deepgram         Fast, real-time   Yes (console.deepgram.com)
Whisper (local)  Slower, accurate  No (runs via Ollama)
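
To get a feel for how much data a 16kHz mono stream produces before compression, the raw rate can be computed directly. The sketch below assumes 16-bit PCM samples after PDM conversion, which this document does not state. Since the firmware uses the Opus codec, the transmitted rate should be considerably lower than this raw figure.

```python
def pcm_bytes_per_second(sample_rate_hz=16_000, bits_per_sample=16, channels=1):
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


rate = pcm_bytes_per_second()  # 16-bit depth is an assumption
print(f"{rate} B/s, {rate * 3600 / 1e6:.1f} MB per hour")
```

With these assumptions the raw stream is 32,000 bytes per second, or roughly 115 MB per hour of continuous capture.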

Camera Configuration

You can tune the camera resolution, capture intervals, and compression settings inside your device's firmware.ino:

Setting           Default Value   Available Options
Resolution        SVGA (800×600)  QQVGA → UXGA
Capture Interval  30 seconds      Any (ms)
JPEG Quality      12              0 (best) – 63 (worst)

Lower resolution and a longer capture interval both extend battery life. For vision AI, SVGA is the sweet spot: good enough for GPT-4o, small enough to transmit quickly.
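
The trade-off can be made concrete by comparing pixel counts and capture frequency. The sketch below lists the standard dimensions for some frame sizes in the QQVGA to UXGA range. The helper names are hypothetical, and pixel count is only a rough proxy for JPEG size and transmit time.

```python
# Standard (width, height) for common frame sizes between QQVGA and UXGA.
FRAME_SIZES = {
    "QQVGA": (160, 120),
    "QVGA": (320, 240),
    "VGA": (640, 480),
    "SVGA": (800, 600),
    "XGA": (1024, 768),
    "SXGA": (1280, 1024),
    "UXGA": (1600, 1200),
}


def captures_per_day(interval_ms):
    """Number of photos taken in 24 hours at a fixed capture interval."""
    return 86_400_000 // interval_ms


def relative_pixels(name, baseline="SVGA"):
    """Pixel count of a frame size relative to the baseline size."""
    w, h = FRAME_SIZES[name]
    bw, bh = FRAME_SIZES[baseline]
    return (w * h) / (bw * bh)


print(captures_per_day(30_000), "captures per day at the 30 second default")
for name in FRAME_SIZES:
    print(f"{name:6s} {relative_pixels(name):.2f}x the pixels of SVGA")
```

At the 30 second default the camera takes 2,880 photos per day, and UXGA carries four times the pixels of SVGA.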

Battery & Power

Aura runs on 2× 250mAh Li-ion cells wired in parallel — 500mAh total at 3.7V nominal. Battery voltage is monitored via a resistor divider connected to GPIO2 (A1).

Battery + ──[R1: 169kΩ]──+──[R2: 110kΩ]── Battery -
                          |
                        GPIO2 (A1)   (2.536:1 voltage divider)
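
The divider ratio follows from the two resistor values: (R1 + R2) / R2. A minimal sketch of the conversion between the voltage seen at GPIO2 and the actual battery voltage is shown here. The function names are hypothetical and ADC calibration is ignored.

```python
R1_OHMS = 169_000
R2_OHMS = 110_000

# Battery voltage divided by pin voltage, about 2.536 for these resistors.
DIVIDER_RATIO = (R1_OHMS + R2_OHMS) / R2_OHMS


def battery_voltage(pin_voltage):
    """Convert the voltage measured at GPIO2 back to battery voltage."""
    return pin_voltage * DIVIDER_RATIO


def pin_voltage(battery_v):
    """Voltage the ADC pin sees for a given battery voltage."""
    return battery_v / DIVIDER_RATIO


print(f"ratio = {DIVIDER_RATIO:.3f}")
print(f"4.2 V battery -> {pin_voltage(4.2):.3f} V at GPIO2")
```

A fully charged 4.2V pack therefore presents about 1.66V at the pin.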

Current Draw by Mode:

Mode                   Current Draw
Active (Camera + BLE)  ~80mA
Standby (BLE only)     ~40mA
Deep sleep             ~2mA

Expected Runtime (500mAh):

Usage Pattern                  Runtime
Heavy (continuous capture)     6–7 hours
Normal (mixed active/standby)  8–10 hours
Light (occasional captures)    12–15 hours
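
These figures follow from dividing capacity by average current. The sketch below is an idealized estimate that ignores regulator losses and battery aging. It uses the current draws listed above, and the duty-cycle splits are illustrative assumptions.

```python
CAPACITY_MAH = 500
CURRENT_MA = {"active": 80, "standby": 40, "deep_sleep": 2}


def runtime_hours(duty, capacity_mah=CAPACITY_MAH):
    """Estimate runtime from the fraction of time spent in each mode.

    duty maps a mode name to a fraction; the fractions must sum to 1.
    """
    if abs(sum(duty.values()) - 1.0) > 1e-9:
        raise ValueError("duty fractions must sum to 1")
    average_ma = sum(CURRENT_MA[mode] * share for mode, share in duty.items())
    return capacity_mah / average_ma


print(f"continuous capture: {runtime_hours({'active': 1.0}):.2f} h")
print(f"half active, half standby: {runtime_hours({'active': 0.5, 'standby': 0.5}):.2f} h")
```

Continuous capture works out to 6.25 hours and an even active/standby split to about 8.3 hours, which is in line with the Heavy and Normal rows.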

Voltage Reference:

Voltage   Battery %  Status
4.2–4.3V  100%       Fully charged
4.0–4.2V  80–100%    Good
3.8–4.0V  20–80%     Moderate
3.7–3.8V  0–20%      Low, charge soon
<3.5V     Critical   Check hardware wiring
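
The table can be expressed as a simple threshold lookup. This is a sketch rather than the firmware's actual logic: battery_status is a hypothetical name, and readings between 3.5V and 3.7V, which the table does not cover, are treated as Low here by assumption.

```python
def battery_status(voltage):
    """Map a battery voltage to the status bands in the table above."""
    if voltage < 3.5:
        return "Critical"
    if voltage < 3.8:
        # Assumption: the 3.5-3.7 V gap in the table is treated as Low.
        return "Low"
    if voltage < 4.0:
        return "Moderate"
    if voltage < 4.2:
        return "Good"
    return "Fully charged"


for v in (4.25, 4.1, 3.9, 3.75, 3.4):
    print(f"{v:.2f} V -> {battery_status(v)}")
```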

Serial Battery Commands:

Connect at 115200 baud and type any of these commands:

Command     Output
status      Voltage, battery %, BLE state
charging    10 readings over 20s with charge status
runtime     Estimated hours remaining
chargetime  Time to 80% / 90% / 100%
monitor     Continuous 5s interval updates (any key stops)
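
These commands can also be sent from a script instead of a serial terminal. The sketch below works with any object that offers write() and readline(), such as a pyserial port opened with serial.Serial(port_name, 115200, timeout=1). The newline terminator is an assumption about how the firmware parses input, and FakePort with its placeholder reply exists only so the example runs without hardware.

```python
import io

VALID_COMMANDS = {"status", "charging", "runtime", "chargetime", "monitor"}


def send_command(port, command, max_lines=20):
    """Send one battery command and collect reply lines until reads run dry."""
    if command not in VALID_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    # Assumption: the firmware expects a newline-terminated command.
    port.write((command + "\n").encode("ascii"))
    lines = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:  # empty read: timeout or end of data
            break
        lines.append(raw.decode("utf-8", errors="replace").strip())
    return lines


class FakePort:
    """Stand-in for a serial port; replays a canned placeholder reply."""

    def __init__(self, reply):
        self.sent = b""
        self._reply = io.BytesIO(reply)

    def write(self, data):
        self.sent += data
        return len(data)

    def readline(self):
        return self._reply.readline()


fake = FakePort(b"placeholder line 1\nplaceholder line 2\n")
print(send_command(fake, "status"))
```

Avoid scripting monitor this way without a stop condition, since it streams updates until a key is pressed.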

Charging & LED Status:

  • Red LED — Charging in progress
  • Green LED — Fully charged
  • Typical charge time: 1–1.5 hours from a USB 2.0 port

Best Practices:

  • Charge when battery voltage drops below 3.8V (20%)
  • Never let voltage drop below 3.5V — cell damage threshold
  • Charge at 10°C–40°C ambient temperature range
  • Store at ~50% charge if unused for more than a week
  • Increase PHOTO_CAPTURE_INTERVAL_MS in firmware for all-day use
  • Enable deep sleep between captures for maximum runtime

OTA Firmware Updates

Aura supports over-the-air firmware updates via BLE + Wi-Fi, so you can update the pendant's firmware without connecting a USB cable. Wi-Fi credentials are delivered from the app over Bluetooth LE — the device only connects to Wi-Fi for the duration of the update, then returns to BLE-only mode.

OTA Update Flow:

  1. Open the Aura app and navigate to Device Settings → Firmware Update.
  2. The app sends Wi-Fi credentials to the pendant via the OTA_CONTROL_UUID BLE characteristic.
  3. Pendant connects to Wi-Fi using the provided credentials.
  4. App transmits the firmware download URL to the pendant.
  5. Pendant downloads and installs the new firmware image.
  6. Device reboots automatically with the new firmware loaded.

Note on Wi-Fi Credentials

Wi-Fi credentials sent during OTA are not stored permanently on the device. After the update completes and the device reboots, it returns to BLE-only advertising mode. Your credentials are never persisted on flash storage.

Memory & RAG

Aura's long-term memory system is built on a Retrieval-Augmented Generation (RAG) pipeline. Every conversation and captured moment is embedded as a high-dimensional vector and stored in Pinecone for semantic recall. When you ask Aura something, it retrieves the most relevant memories and uses them to ground the AI response, which reduces answers that rely only on the model's stale training snapshot.

Pipeline Architecture:

  1. Audio transcription arrives from Deepgram (or local Whisper) as plain text.
  2. The backend chunks the transcript into overlapping segments for context preservation.
  3. Each chunk is embedded using OpenAI's text-embedding-3-small model (1536 dimensions).
  4. Vectors are upserted into your Pinecone index, tagged with timestamp and session metadata.
  5. On retrieval queries, the user's question is embedded identically and the top-k nearest vectors are fetched.
  6. Retrieved chunks are injected into the system prompt context window before the LLM call (Groq or GPT-4o).
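
Steps 2 and 5 can be sketched in plain Python. This is an illustration of the technique, not the backend's code: the chunk size and overlap are assumed values, word-based splitting is an assumption, and the tiny 2-D vectors in the example stand in for real 1536-dimension embeddings. In the actual pipeline Pinecone performs the nearest-neighbor search.

```python
import math


def chunk_words(text, size=50, overlap=10):
    """Split text into word windows of `size` that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, stored, k=3):
    """Return the k chunk texts whose vectors are most similar to the query.

    stored is a list of (chunk_text, vector) pairs.
    """
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


print(chunk_words("a b c d e f g", size=4, overlap=2))

# Toy vectors standing in for embeddings.
memories = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [1.0, 1.0])]
print(top_k([1.0, 0.0], memories, k=2))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is the context preservation step 2 refers to.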

Pinecone Index Setup:

# Create a new Pinecone index
# Dimensions: 1536 (matches OpenAI text-embedding-3-small)
# Metric: cosine
# Name: set in .env as PINECONE_INDEX_NAME=aura-memories

Log in to app.pinecone.io, create an index with 1536 dimensions and cosine similarity, then add its name to your backend .env.

Redis Memory Cache:

Redis is used as a fast ephemeral cache layer for recent conversation context — reducing redundant Pinecone queries and improving response latency. Use Upstash for a free, serverless Redis instance:

REDIS_DB_HOST=your-upstash-endpoint
REDIS_DB_PORT=6379
REDIS_DB_PASSWORD=your-upstash-password

Memory Stack Summary

Layer              Technology                     Purpose
Embedding          OpenAI text-embedding-3-small  Convert text → 1536-dim vectors
Long-term storage  Pinecone                       Semantic vector search
Short-term cache   Redis / Upstash                Recent context, low latency
Structured data    Firebase Firestore             User settings, API keys, session log

Privacy & Data

Aura is privacy-first by design. Nothing is sent anywhere by default without your explicit configuration. You control every API key and every endpoint.

Feature         Local Option          Cloud Option
Transcription   Whisper (Ollama)      Deepgram
AI Processing   Ollama                Groq / OpenAI
Memory Storage  Self-hosted Firebase  Aura Cloud
Images          Your backend          Your backend

Note: Recording in public spaces may be subject to local laws. Always inform others when recording.

Troubleshooting

Firmware & Hardware

Problem                            Fix
Camera init failed                 Tools → PSRAM → OPI PSRAM → re-upload
COM port missing (Windows)         Install CH340 driver, restart
Device not appearing as USB drive  Re-enter bootloader mode, try a different USB cable (data cable required)
Build fails (PlatformIO)           Run pip install platformio, then pio run --target clean
Always shows 0% or 100% battery    Check voltage divider wiring on GPIO2; verify R1=169kΩ, R2=110kΩ
BLE not pairing                    Power cycle device, forget in phone Bluetooth settings, re-pair
Wi-Fi not connecting               Confirm 2.4GHz network; ESP32 does not support 5GHz

Backend & API

No internet connection when loading models

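Add to utils/stt/vad.py after the imports block. This disables TLS certificate verification for every HTTPS request the process makes, so treat it as a local-development workaround:

import ssl
# Replace the default HTTPS context with one that skips certificate checks.
ssl._create_default_https_context = ssl._create_unverified_context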

Internal Server Error on Developer Settings page

Firestore composite indexes are missing. See Setting Up the Backend → Step 1 for the required index definitions.

Authentication errors from gcloud

Re-run application-default login:

gcloud auth application-default login --project <your-project-id>

No transcription appearing

Check the Deepgram API key in your .env file. Verify the backend server is running and that API_BASE_URL in the app includes a trailing slash.

Question not answered?

Contact support at: thesohamdatta@gmail.com