System Architecture
The Aura project represents a shift toward modular open-source hardware. This documentation provides a technical foundation for building, deploying, and maintaining Aura systems.
What Is Aura
Aura is an open-source AI wearable pendant. You wear it around your neck. It listens, sees, and thinks — so you don't have to pause your day to take notes or remember things.
It captures audio and images, sends them to an AI, and turns everything into summaries, transcripts, and action items — all accessible from your phone.
Built on the Omi ecosystem. Fully open source. Around $50–70 to build yourself.
How Aura Works
The pipeline runs in seven steps:
- Aura's microphone captures audio continuously.
- The camera captures images at set intervals.
- Both are sent to the backend over Wi-Fi.
- The backend transcribes audio via Deepgram or Whisper.
- Images are understood via GPT-4o Vision or Moondream.
- Everything is summarized and stored.
- You see it all in the Omi app on your phone.
The ESP32-S3 handles capture and transmission. The backend handles all AI processing. Your phone is just the mobile interface.
Bill of Materials
What to buy and where to get it:
| Item | Where to Buy | Approx Cost |
|---|---|---|
| XIAO ESP32-S3 Sense | Seeed Studio / Amazon | $15–24 |
| 150mAh LiPo × 6 | Amazon | $12 |
| Wires | Amazon | $5 |
| 3D printed case | Print yourself or order online | $5–10 |
| USB-C cable | Anywhere | — |
| Total Cost | | ~$50–70 |
You can print the case STL files from the Aura hardware folder on GitHub.
Case Assembly
Download the STL files from the Aura Hardware folder on GitHub.
Recommended 3D Print Settings:
- Material: PLA or PETG
- Layer height: 0.2mm
- Infill: 20%
- Supports: Yes, where needed
After printing, mount the ESP32-S3 board, route the battery wires, and snap the case shut. The pendant loop is built into the design. If you don't have a 3D printer, you can use local makerspaces or online printing services like Craftcloud or PCBWay.
Hardware Overview
The XIAO ESP32-S3 Sense is the core. Camera and microphone are already integrated on the board — no extra modules needed.
| Component | Details |
|---|---|
| Microcontroller | XIAO ESP32-S3 Sense |
| Camera | OV2640 — built into ESP32-S3 Sense |
| Microphone | PDM — built into ESP32-S3 Sense |
| Battery | 6× 150mAh LiPo cells |
| Enclosure | Custom 3D printed case |
| Connectivity | Wi-Fi 2.4GHz + Bluetooth LE |
| Charging | USB-C |
Flashing the Firmware
Aura firmware runs on the XIAO ESP32-S3 Sense. It coordinates audio streams, BLE advertisement, Wi-Fi connections, and camera frames. You can install it using several methods depending on your development environment.
Method 1: UF2 Drag-and-Drop Flash (Easiest)
This method requires no tools or compiler environments.
- Connect your XIAO ESP32-S3 Sense to your computer using a USB-C data cable.
- Enter bootloader mode: Hold down the Boot button on the board, press and release the Reset button, then release the Boot button.
- A new USB disk drive named "ESP32S3" will mount on your system.
- Copy the pre-compiled .uf2 file from the firmware/releases/ folder in the repo and drop it directly onto the mounted drive. The device will auto-flash and reboot.
Method 2: PlatformIO (Recommended for developers)
PlatformIO gives you full control over the build environments. From the repo root, run:
    pip install platformio
    cd firmware

    # Compile and upload via standard environment
    platformio run -e seeed_xiao_esp32s3 --target upload

    # Open serial debug output
    platformio device monitor --baud 115200
Method 3: Arduino IDE
If flashing via the Arduino IDE, follow these settings carefully — compilation may succeed but the camera will fail at runtime if PSRAM is not configured correctly:
- Install Arduino IDE 2.x.
- Add the ESP32 board manager URL in File → Preferences → Additional boards manager URLs: https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json
- Go to Tools → Board → Boards Manager → search esp32 → Install version 2.0.17.
- Configure the board profile in the Tools menu. Get these right or the camera won't work:

      Tools → Board → XIAO_ESP32S3
      Tools → PSRAM → OPI PSRAM   ← Required. Camera fails without this.
      Tools → Port  → your COM port

On Windows, if the COM port doesn't appear, install the CH340 driver and restart.
- Open firmware/firmware.ino and click Upload.
- Open Serial Monitor at 115200 baud. Expected output:

      [AURA] Camera initialized
      [AURA] Microphone initialized
      [AURA] BLE advertising
      [AURA] Ready

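If you capture the serial output to a file or string (for example in a CI job), the expected boot lines above can be checked automatically. The following is a minimal sketch, assuming the firmware prints exactly the four `[AURA]` lines shown; the function name `missing_boot_stages` is hypothetical and not part of the Aura codebase.

```python
# Boot stages taken from the expected Serial Monitor output above.
EXPECTED_STAGES = (
    "Camera initialized",
    "Microphone initialized",
    "BLE advertising",
    "Ready",
)


def missing_boot_stages(log_text: str) -> list:
    """Return the expected stages that never appear on an [AURA] log line."""
    seen = set()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("[AURA]"):
            # Keep only the message after the "[AURA]" tag.
            seen.add(line.split("]", 1)[1].strip())
    return [stage for stage in EXPECTED_STAGES if stage not in seen]


if __name__ == "__main__":
    sample = "[AURA] Microphone initialized\n[AURA] BLE advertising\n[AURA] Ready\n"
    print(missing_boot_stages(sample))
```

If the camera line is reported as missing, the PSRAM setting is the first thing to re-check.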
Method 4: Arduino CLI
For headless or CI environments, use the Arduino CLI to compile and upload directly from the terminal:
    # Add ESP32 board manager
    arduino-cli config add board_manager.additional_urls \
      https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json
    arduino-cli core install esp32:esp32@2.0.17

    # Check board ID
    arduino-cli board list
    # Windows 11 should show: esp32:esp32:XIAO_ESP32S3

    # Compile and upload (replace COM5 with your port)
    arduino-cli compile --build-path build --output-dir dist \
      -e -u -p COM5 -b esp32:esp32:XIAO_ESP32S3:PSRAM=opi
The Aura firmware depends on two external libraries for Opus audio codec support. Clone them into your Arduino libraries folder:
    # Find your libraries folder
    arduino-cli config get directories.user
    # then add /libraries to that path

    git clone https://github.com/pschatzmann/arduino-libopus.git
    git clone https://github.com/pschatzmann/arduino-audio-tools.git
PlatformIO Build Environments
| Environment | Description | Use Case |
|---|---|---|
| seeed_xiao_esp32s3 | Standard build | Development |
| seeed_xiao_esp32s3_slow | Slower upload speed | Connection issues |
| uf2_release | Optimized release build | Production / best battery |
Setting Up the Backend
The Aura backend powers all AI capabilities, transcription pipelines, conversation processing, and cloud integrations. It acts as the orchestration layer between the physical ESP32-S3 hardware and the AI model APIs.
System Requirements & Dependencies:
- Python 3.10 or 3.11 (tested primarily on Python 3.10.12 / 3.11.4)
- Google Cloud SDK & Firebase Command Line Tools
- System packages: git, ffmpeg, and opus
- Ngrok or Cloudflare Tunnel (for exposing your local development server)
Step 1: Google Cloud & Firebase Initialization
Aura uses Firebase Firestore for storing conversation keys and settings. Authenticate your terminal and assign the active developer project:
    gcloud auth login
    gcloud config set project <your-project-id>
    gcloud auth application-default login --project <your-project-id>
Ensure the following APIs are enabled in the Google Cloud API Console: Cloud Resource Manager API, Firebase Management API, and Cloud Firestore API. Then, create the necessary composite indexes in Firestore:
| Collection ID | Index Fields |
|---|---|
| dev_api_keys | user_id (Ascending) + created_at (Descending) |
| mcp_api_keys | user_id (Ascending) + created_at (Descending) |
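If you would rather keep these indexes in version control than click through the console, they can be written in the `firestore.indexes.json` format that the Firebase CLI deploys with `firebase deploy --only firestore:indexes`. The following is a minimal sketch that generates that file's content. The helper names are hypothetical, and it assumes both indexes use collection scope, which the source does not state.

```python
import json


def composite_index(collection: str) -> dict:
    """One composite index entry: user_id ascending, created_at descending."""
    return {
        "collectionGroup": collection,
        "queryScope": "COLLECTION",  # assumption: queries target a single collection
        "fields": [
            {"fieldPath": "user_id", "order": "ASCENDING"},
            {"fieldPath": "created_at", "order": "DESCENDING"},
        ],
    }


def build_indexes() -> dict:
    """Index definitions for the two collections listed in the table above."""
    return {
        "indexes": [composite_index(name) for name in ("dev_api_keys", "mcp_api_keys")],
        "fieldOverrides": [],
    }


if __name__ == "__main__":
    # Redirect this output into firestore.indexes.json in your Firebase project folder.
    print(json.dumps(build_indexes(), indent=2))
```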
Step 2: Backend Environment Variables Setup
Clone the codebase, enter the backend folder, and create your .env file based on the template:
    git clone https://github.com/thesohamdatta/Aura-Wearable-AI.git
    cd Aura-Wearable-AI/backend
    cp .env.template .env
Configure the parameters inside your .env. You must provide:
- OPENAI_API_KEY & DEEPGRAM_API_KEY
- PINECONE_API_KEY & PINECONE_INDEX_NAME (index set to 1536 dimensions for OpenAI vectors)
- REDIS_DB_HOST, REDIS_DB_PORT, and REDIS_DB_PASSWORD (for the memory cache)
- GOOGLE_APPLICATION_CREDENTIALS=google-credentials.json
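A missing key usually surfaces later as an opaque runtime error, so it can help to check the settings up front. The following is a minimal sketch, assuming the eight variables listed above are the full required set. `missing_vars` is a hypothetical helper, not part of the Aura backend, and it inspects the process environment rather than parsing the .env file itself.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "REDIS_DB_HOST",
    "REDIS_DB_PORT",
    "REDIS_DB_PASSWORD",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def missing_vars(env) -> list:
    """Return required names that are absent or blank in the given mapping."""
    return [name for name in REQUIRED_VARS if not str(env.get(name, "")).strip()]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        raise SystemExit("Missing settings: " + ", ".join(missing))
    print("All required settings are present.")
```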
Step 3: Installation & Server Boot
    # Create and activate virtual environment
    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate

    # Install requirements
    pip install -r requirements.txt

    # Start FastAPI server
    uvicorn main:app --reload --env-file .env --port 8000
Step 4: Exposing local server to device (Ngrok)
Start a secure tunnel on port 8000 to acquire a public HTTPS URL:
ngrok http 8000
Copy the generated HTTPS tunnel address and paste it as the API_BASE_URL in your app's environment configuration, making sure to append the trailing slash: API_BASE_URL=https://<subdomain>.ngrok-free.app/.
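Because a missing trailing slash is an easy mistake to make, the value can be normalized before you paste it into the app. The following is a minimal sketch; `normalize_base_url` is a hypothetical helper and the `abc` subdomain is a placeholder.

```python
from urllib.parse import urlsplit


def normalize_base_url(url: str) -> str:
    """Check that url is an HTTPS URL and return it with exactly one trailing slash."""
    url = url.strip()
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an https:// URL, got {url!r}")
    return url.rstrip("/") + "/"


if __name__ == "__main__":
    # Placeholder tunnel address; substitute the one ngrok prints for you.
    print("API_BASE_URL=" + normalize_base_url("https://abc.ngrok-free.app"))
```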
Setting Up the App
Aura ships a native Android companion app built with Kotlin. It pairs with the pendant over Bluetooth LE, lets you configure the backend URL, browse transcription history, and monitor battery status in real time.
Requirements:
- Android Studio Hedgehog or later
- Android SDK 26+ (Android 8.0 minimum)
- A physical Android device with Bluetooth LE support — BLE emulation is unreliable on emulators
Option A — Android Studio (Recommended):
- Clone the repo and open app/android/ in Android Studio.
- Wait for the Gradle sync to complete.
- Connect your Android device via USB and enable USB Debugging in Developer Options.
- Click Run ▶.
Option B — Command Line:
    cd app/android

    # Debug build
    ./gradlew assembleDebug

    # Install directly on connected device
    ./gradlew installDebug
APK output: app/build/outputs/apk/debug/app-debug.apk
Configuration:
After first launch, open the app settings and set your backend URL:
API_BASE_URL = https://your-ngrok-url.ngrok-free.app/
Include the trailing slash.
Pairing with Aura:
- Flash the firmware to the pendant (see Flashing the Firmware above).
- Power on the Aura pendant.
- Open the app → go to Devices.
- Tap Pair new device → select AURA from the BLE scan list.
- Connection established — the pendant LED confirms the pairing.
App Troubleshooting
| Issue | Fix |
|---|---|
| Device not found in scan | Power cycle pendant, ensure BLE is enabled on phone |
| Gradle sync fails | Check Android SDK version, update Gradle plugin |
| APK installs but crashes | Verify API_BASE_URL is set correctly with trailing slash |
| No transcription appearing | Check backend is running and ngrok tunnel is active |
AI Providers Setup
Aura supports three distinct AI providers. Set your preferred choice in your environment (.env) file:
| Provider | Best For | Internet Required |
|---|---|---|
| Groq | Fast, real-time responses | Yes |
| OpenAI | Best quality, vision support | Yes |
| Ollama | Local, private, offline | No (Localhost) |
For Ollama (local private setup), set OLLAMA_URL=http://localhost:11434 in your backend .env.
Transcription Setup
Aura captures high-fidelity audio at 16kHz mono PDM from the built-in microphone on the ESP32-S3 Sense. Two transcription options are available:
| Option | Speed | Internet Required |
|---|---|---|
| Deepgram | Fast, real-time | Yes (console.deepgram.com) |
| Whisper (local) | Slower, accurate | No (runs via Ollama) |
Camera Configuration
You can tune the camera resolution, capture intervals, and compression settings inside your device's firmware.ino:
| Setting | Default Value | Available Options |
|---|---|---|
| Resolution | SVGA (800×600) | QQVGA → UXGA |
| Capture Interval | 30 seconds | Any (ms) |
| JPEG Quality | 12 | 0 (best) – 63 (worst) |
Lower resolution + higher interval = longer battery life. For vision AI, SVGA is the sweet spot — good enough for GPT-4o, small enough to transmit fast.
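To get a feel for this trade-off, you can estimate how many frames are captured and roughly how much data is uploaded per hour. The following is a minimal sketch. The function names are hypothetical, and the average JPEG size is an assumption you should replace with a value measured on your own device, since it varies with resolution, quality setting, and scene.

```python
def captures_per_hour(interval_ms: float) -> float:
    """Number of frames captured in one hour at the given interval."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return 3_600_000 / interval_ms


def upload_mb_per_hour(interval_ms: float, avg_frame_kb: float) -> float:
    """Approximate image upload volume per hour in MB (1 MB = 1024 KB)."""
    return captures_per_hour(interval_ms) * avg_frame_kb / 1024


if __name__ == "__main__":
    interval_ms = 30_000   # default capture interval from the table above
    avg_frame_kb = 40      # assumption: placeholder frame size, measure your own
    print(captures_per_hour(interval_ms), "frames/hour")
    print(upload_mb_per_hour(interval_ms, avg_frame_kb), "MB/hour")
```

At the default 30-second interval this works out to 120 frames per hour; doubling the interval halves both the frame count and the upload volume.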
Battery & Power
Aura runs on 2× 250mAh Li-ion cells wired in parallel — 500mAh total at 3.7V nominal. Battery voltage is monitored via a resistor divider connected to GPIO2 (A1).
    Battery + ──[R1: 169kΩ]──+──[R2: 110kΩ]── Battery -
                             |
                        GPIO2 (A1)    (2.536:1 voltage divider)
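The divider math can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only. The function names are hypothetical, and how the firmware samples and calibrates the ADC is not covered here.

```python
R1_OHMS = 169_000  # between Battery + and the GPIO2 tap
R2_OHMS = 110_000  # between the GPIO2 tap and Battery -

# Battery voltage divided by pin voltage, about 2.536.
DIVIDER_RATIO = (R1_OHMS + R2_OHMS) / R2_OHMS


def pin_voltage(battery_v: float) -> float:
    """Voltage seen at GPIO2 for a given battery voltage."""
    return battery_v / DIVIDER_RATIO


def battery_voltage(pin_v: float) -> float:
    """Battery voltage reconstructed from the voltage measured at GPIO2."""
    return pin_v * DIVIDER_RATIO


if __name__ == "__main__":
    for volts in (4.2, 3.7, 3.5):
        print(f"battery {volts:.2f} V -> pin {pin_voltage(volts):.3f} V")
```

A full 4.2V cell puts roughly 1.66V on the pin, which keeps the reading well under the ESP32-S3's 3.3V supply rail.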
Current Draw by Mode:
| Mode | Current Draw |
|---|---|
| Active (Camera + BLE) | ~80mA |
| Standby (BLE only) | ~40mA |
| Deep sleep | ~2mA |
Expected Runtime (500mAh):
| Usage Pattern | Runtime |
|---|---|
| Heavy — continuous capture | 6–7 hours |
| Normal — mixed active/standby | 8–10 hours |
| Light — occasional captures | 12–15 hours |
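These runtimes follow from dividing capacity by average current. The following is a minimal sketch of that estimate using the current-draw figures above. The function names are hypothetical, and the result is idealized: it ignores regulator losses, Wi-Fi transmit bursts, and capacity loss as the cells age.

```python
def runtime_hours(capacity_mah: float, current_ma: float) -> float:
    """Ideal runtime: capacity divided by a constant current draw."""
    if current_ma <= 0:
        raise ValueError("current_ma must be positive")
    return capacity_mah / current_ma


def mixed_runtime_hours(capacity_mah: float, active_fraction: float,
                        active_ma: float = 80.0, standby_ma: float = 40.0) -> float:
    """Runtime when the device is active for active_fraction of the time (0 to 1)."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    average_ma = active_fraction * active_ma + (1.0 - active_fraction) * standby_ma
    return runtime_hours(capacity_mah, average_ma)


if __name__ == "__main__":
    print(runtime_hours(500, 80))         # continuous capture
    print(mixed_runtime_hours(500, 0.5))  # half active, half standby
```

Continuous capture at 80mA gives 6.25 hours, and a 50/50 mix of active and standby gives about 8.3 hours, both in line with the table.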
Voltage Reference:
| Voltage | Battery % | Status |
|---|---|---|
| 4.2–4.3V | 100% | Fully charged |
| 4.0–4.2V | 80–100% | Good |
| 3.8–4.0V | 20–80% | Moderate |
| 3.7–3.8V | 0–20% | Low — charge soon |
| <3.5V | Critical | Check hardware wiring |
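The table can be turned into a simple lookup. The following is a minimal sketch; `battery_status` is a hypothetical name and this is not the firmware's own logic. The table leaves 3.5–3.7V unspecified, so this sketch treats that range as "Low", which is an assumption.

```python
# (lower bound in volts, status), checked from highest to lowest.
STATUS_BANDS = (
    (4.2, "Fully charged"),
    (4.0, "Good"),
    (3.8, "Moderate"),
    (3.5, "Low"),  # assumption: the table's unlisted 3.5-3.7 V gap is treated as Low
)


def battery_status(voltage: float) -> str:
    """Map a battery voltage to the status column of the voltage reference table."""
    for lower_bound, status in STATUS_BANDS:
        if voltage >= lower_bound:
            return status
    return "Critical"


if __name__ == "__main__":
    for volts in (4.25, 4.1, 3.9, 3.75, 3.4):
        print(volts, battery_status(volts))
```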
Serial Battery Commands:
Connect at 115200 baud and type any of these commands:
| Command | Output |
|---|---|
| status | Voltage, battery %, BLE state |
| charging | 10 readings over 20s with charge status |
| runtime | Estimated hours remaining |
| chargetime | Time to 80% / 90% / 100% |
| monitor | Continuous 5s interval updates (any key stops) |
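These commands can also be scripted. The following is a minimal sketch of a helper that sends one command and collects the reply lines. It works with any object that offers `write()` and `readline()`, such as a pyserial port opened with a read timeout. The helper name is hypothetical, and the newline terminator is an assumption about what the firmware expects.

```python
def send_command(port, command: str, max_lines: int = 20) -> list:
    """Send one command and return the reply lines read before the port goes quiet."""
    # Assumption: the firmware treats a newline as the end of a command.
    port.write((command + "\n").encode("ascii"))
    lines = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:  # an empty read means the read timeout expired
            break
        lines.append(raw.decode("utf-8", errors="replace").strip())
    return lines
```

With pyserial installed, a port such as `serial.Serial("COM5", 115200, timeout=2)` can be passed as `port`; replace `COM5` with your own port name.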
Charging & LED Status:
- Red LED — Charging in progress
- Green LED — Fully charged
- Typical charge time: 1–1.5 hours from a USB 2.0 port
Best Practices:
- Charge when battery voltage drops below 3.8V (20%)
- Never let voltage drop below 3.5V — cell damage threshold
- Charge at 10°C–40°C ambient temperature range
- Store at ~50% charge if unused for more than a week
- Increase PHOTO_CAPTURE_INTERVAL_MS in firmware for all-day use
- Enable deep sleep between captures for maximum runtime
OTA Firmware Updates
Aura supports over-the-air firmware updates via BLE + Wi-Fi, so you can update the pendant's firmware without connecting a USB cable. Wi-Fi credentials are delivered from the app over Bluetooth LE — the device only connects to Wi-Fi for the duration of the update, then returns to BLE-only mode.
OTA Update Flow:
- Open the Aura app and navigate to Device Settings → Firmware Update.
- The app sends Wi-Fi credentials to the pendant via the OTA_CONTROL_UUID BLE characteristic.
- Pendant connects to Wi-Fi using the provided credentials.
- App transmits the firmware download URL to the pendant.
- Pendant downloads and installs the new firmware image.
- Device reboots automatically with the new firmware loaded.
Note on Wi-Fi Credentials
Wi-Fi credentials sent during OTA are not stored permanently on the device. After the update completes and the device reboots, it returns to BLE-only advertising mode. Your credentials are never persisted on flash storage.
Memory & RAG
Aura's long-term memory system is built on a Retrieval-Augmented Generation (RAG) pipeline. Every conversation and captured moment is embedded as a high-dimensional vector and stored in Pinecone for semantic recall. When you ask Aura something, it retrieves the most relevant memories and passes them to the LLM as context, so answers are grounded in your own data rather than only in the model's training snapshot.
Pipeline Architecture:
- Audio transcription arrives from Deepgram (or local Whisper) as plain text.
- The backend chunks the transcript into overlapping segments for context preservation.
- Each chunk is embedded using OpenAI's text-embedding-3-small model (1536 dimensions).
- Vectors are upserted into your Pinecone index, tagged with timestamp and session metadata.
- On retrieval queries, the user's question is embedded identically and the top-k nearest vectors are fetched.
- Retrieved chunks are injected into the system prompt context window before the LLM call (Groq or GPT-4o).
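The chunking and retrieval steps above can be illustrated without any external services. The following is a minimal in-memory sketch: a plain Python list stands in for Pinecone, and the vectors are toy values rather than real embeddings. All function names are hypothetical, and the chunk size and overlap are illustrative assumptions, since the backend's actual parameters are not stated here.

```python
import math


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list:
    """Split text into windows of `size` words that share `overlap` words with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # this window already reached the end
            break
    return chunks


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, stored, k: int = 3) -> list:
    """Return the texts of the k stored (text, vector) pairs most similar to query_vec."""
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    print(chunk_words("a b c d e f g h i j", size=4, overlap=1))
    memories = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [0.9, 0.1])]
    print(top_k([1.0, 0.0], memories, k=2))
```

In the real pipeline the vectors come from the embedding model and the ranking is done by Pinecone's cosine-metric query, but the flow of chunk, embed, rank, and inject is the same.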
Pinecone Index Setup:
    # Create a new Pinecone index
    # Dimensions: 1536 (matches OpenAI text-embedding-3-small)
    # Metric: cosine
    # Name: set in .env as PINECONE_INDEX_NAME=aura-memories
Log in to app.pinecone.io, create an index with 1536 dimensions and cosine similarity, then add its name to your backend .env.
Redis Memory Cache:
Redis is used as a fast ephemeral cache layer for recent conversation context — reducing redundant Pinecone queries and improving response latency. Use Upstash for a free, serverless Redis instance:
    REDIS_DB_HOST=your-upstash-endpoint
    REDIS_DB_PORT=6379
    REDIS_DB_PASSWORD=your-upstash-password
Memory Stack Summary
| Layer | Technology | Purpose |
|---|---|---|
| Embedding | OpenAI text-embedding-3-small | Convert text → 1536-dim vectors |
| Long-term storage | Pinecone | Semantic vector search |
| Short-term cache | Redis / Upstash | Recent context, low latency |
| Structured data | Firebase Firestore | User settings, API keys, session log |
Privacy & Data
Aura is privacy-first by design. Nothing is sent anywhere by default without your explicit configuration. You control every API key and every endpoint.
| Feature | Local Option | Cloud Option |
|---|---|---|
| Transcription | Whisper (Ollama) | Deepgram |
| AI Processing | Ollama | Groq / OpenAI |
| Memory Storage | Self-hosted Firebase | Aura Cloud |
| Images | Your backend | Your backend |
Note: Recording in public spaces may be subject to local laws. Always inform others when recording.
Troubleshooting
Firmware & Hardware
| Problem | Fix |
|---|---|
| Camera init failed | Tools → PSRAM → OPI PSRAM → re-upload |
| COM port missing (Windows) | Install CH340 driver, restart |
| Device not appearing as USB drive | Re-enter bootloader mode, try a different USB cable (data cable required) |
| Build fails (PlatformIO) | Run pip install platformio, then pio run --target clean |
| Always shows 0% or 100% battery | Check voltage divider wiring on GPIO2 — verify R1=169kΩ, R2=110kΩ |
| BLE not pairing | Power cycle device, forget in phone Bluetooth settings, re-pair |
| Wi-Fi not connecting | Confirm 2.4GHz network — ESP32 does not support 5GHz |
Backend & API
No internet connection when loading models
Add to utils/stt/vad.py after the imports block:
    import ssl

    # Workaround: skip HTTPS certificate verification for model downloads
    ssl._create_default_https_context = ssl._create_unverified_context
Internal Server Error on Developer Settings page
Firestore composite indexes are missing. See Setting Up the Backend → Step 1 for the required index definitions.
Authentication errors from gcloud
Re-run application-default login:
gcloud auth application-default login --project <your-project-id>
No transcription appearing
Check the Deepgram API key in your .env file. Verify the backend server is running and that API_BASE_URL in the app includes a trailing slash.
Question not answered?
Contact support at: thesohamdatta@gmail.com