# @audin.ai/operator-sdk
Headless browser SDK for the **Audin operator softphone**. Make and receive
phone calls from a web app over the Audin operator WebSockets — the SDK handles
the microphone, audio transcoding, the signalling and media channels,
reconnection and heartbeats. **There is no UI**: you build the interface, the
SDK does the plumbing.
- Framework-agnostic, zero runtime dependencies.
- Ships as ESM and UMD with full TypeScript types.
- Your account credentials **never enter the browser** — the SDK obtains
short-lived session tokens through a `getToken` callback you provide.
---
## Install
```bash
npm install @audin.ai/operator-sdk
```
Or drop the UMD bundle in via a `<script>` tag (global `AudinOperatorSDK`):
```html
<script src="https://unpkg.com/@audin.ai/operator-sdk/dist/audin-operator-sdk.umd.cjs"></script>
<script>
const op = new AudinOperatorSDK.AudinOperator({ /* … */ });
</script>
```
### Compatibility
The SDK communicates with the Audin operator service over a **versioned wire
protocol, currently `v1`**. The package follows [SemVer](https://semver.org):
patch and minor releases keep that protocol compatible, while a **breaking
change to the wire protocol ships as a MAJOR version bump** (e.g. `1.x → 2.0`).
Pin to a compatible range (`^0.1` while pre-1.0, which npm resolves to `0.1.x`
only) and read the [CHANGELOG](./CHANGELOG.md) before upgrading across a MAJOR.
Note that pre-1.0, the public API may still evolve in minor releases.
---
## The token flow (read this first)
The SDK is given a callback, `getToken`, that returns a short-lived session
token. **You implement `getToken` to call your own backend**, which holds your
Audin Account API Key and proxies to the Audin API:
```
browser (SDK) ──getToken()──▶ your backend ──X-API-Key──▶ Audin API
                                           POST /operator-sessions/token
browser (SDK) ◀──{ token }─── your backend ◀──{ token }──
```
Your backend endpoint (example):
```ts
// On YOUR server: the API key stays here, never in the browser.
app.post("/api/operator/token", async (req, res) => {
  const r = await fetch("https://api.audin.ai/operator-sessions/token", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": process.env.AUDIN_ACCOUNT_API_KEY, // server-side secret
    },
    body: JSON.stringify({
      operatorRef: req.user.id, // your stable operator identifier
      displayName: req.user.name,
      phoneNumberIds: req.user.assignedPhoneNumberIds,
    }),
  });
  if (!r.ok) {
    // Surface upstream failures instead of replying with an undefined token.
    res.status(502).json({ error: "token request failed" });
    return;
  }
  const data = await r.json();
  // Return at least { token }. (expiresAt is optional but recommended.)
  res.json({ token: data.token, expiresAt: data.expiresAt });
});
```
The token is valid for about an hour; the SDK calls `getToken` again whenever it
needs a fresh one (on connect, on reconnect, and when opening a call's audio
leg), so always fetch a **new** token rather than caching an expired one.
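
On the browser side, a `getToken` implementation can check what your backend returned before handing it to the SDK. The helper below is a minimal sketch, not part of the SDK: the name `parseTokenResponse` is hypothetical, and it assumes `expiresAt`, when present, is a date string that `Date.parse` understands (for example ISO 8601).

```ts
// Shape the SDK expects back from getToken (see the config table).
type SessionToken = { token: string; expiresAt?: string };

// Hypothetical helper: validates a decoded JSON body from YOUR token endpoint.
// Assumption: expiresAt is a Date.parse-compatible string when present.
function parseTokenResponse(data: unknown, nowMs: number = Date.now()): SessionToken {
  if (typeof data !== "object" || data === null) {
    throw new Error("token response is not a JSON object");
  }
  const { token, expiresAt } = data as { token?: unknown; expiresAt?: unknown };
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("token response has no usable token");
  }
  if (expiresAt === undefined || expiresAt === null) {
    return { token }; // expiresAt is optional
  }
  if (typeof expiresAt !== "string") {
    throw new Error("expiresAt must be a string");
  }
  const expiresMs = Date.parse(expiresAt);
  if (Number.isNaN(expiresMs)) {
    throw new Error("expiresAt is not a parseable date");
  }
  if (expiresMs <= nowMs) {
    throw new Error("token is already expired");
  }
  return { token, expiresAt };
}

// Possible use inside the SDK config (browser code, shown as a comment):
//   getToken: async () => {
//     const r = await fetch("/api/operator/token", { method: "POST" });
//     if (!r.ok) throw new Error(`token endpoint returned ${r.status}`);
//     return parseTokenResponse(await r.json());
//   },
```

Throwing from `getToken` here is a design choice for the sketch; how the SDK reports a rejected `getToken` promise is not covered by this README.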
---
## Quick start
```ts
import { AudinOperator } from "@audin.ai/operator-sdk";
const op = new AudinOperator({
  coreUrl: "https://core.audin.ai",
  getToken: async () => {
    const r = await fetch("/api/operator/token", { method: "POST" });
    return r.json(); // { token, expiresAt? }
  },
});
// ── inbound ──────────────────────────────────────────────
op.on("incomingCall", (call) => {
  console.log("ringing from", call.from);
  // Show your own "accept / reject" UI, then:
  call.accept(); // or call.reject();
});
op.on("callStarted", (call) => {
  console.log("connected:", call.callSid, call.direction);
});
op.on("callEnded", (call) => {
  console.log("ended:", call.callSid, "reason:", call.endReason);
});
op.on("error", (e) => console.error(e.code, e.message));
// ── pick a number ────────────────────────────────────────
// List the numbers this account owns (fetched via the SDK with the same
// session token as the WebSockets — no API key in the browser):
const numbers = await op.listPhoneNumbers();
// → [{ id: "pn_1", phoneNumber: "+390299999999", displayName: "Milano" }, ...]
const mine = numbers[0]; // e.g. the one the operator selected in your UI
// Go online on the number's id.
await op.goOnline([mine.id]);
// ── outbound ─────────────────────────────────────────────
// Use the same number's E.164 as the caller ID.
const call = await op.dial("+39021234567", { callerId: mine.phoneNumber });
call.mute(true);
call.mute(false);
call.sendDtmf("5"); // see note: not yet forwarded to the phone network
call.hangup();
// When the operator logs out / closes the app:
await op.goOffline();
```
---
## API
### `new AudinOperator(config)`
| Option | Type | Default | Notes |
|---|---|---|---|
| `coreUrl` | `string` | — | Audin operator service base URL. `http(s)` is upgraded to `ws(s)` internally. |
| `getToken` | `() => Promise<{ token: string; expiresAt?: string }>` | — | Fetches a fresh session token from **your** backend. |
| `heartbeatIntervalMs` | `number` | `25000` | Presence keep-alive interval. |
| `reconnectBackoffMs` | `number[]` | `[1000,2000,5000,10000,30000]` | Backoff schedule for presence reconnects. |
| `audioConstraints` | `MediaTrackConstraints` | echo cancel + noise suppress + AGC | Passed to `getUserMedia({ audio })`. |
| `logger` | `OperatorLogger` | `console` | Diagnostic sink. |
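
Two of the options above describe behaviour that is easy to picture in code. The sketch below is illustrative only and is not the SDK's implementation: `toWebSocketUrl` and `backoffDelayMs` are hypothetical names, and staying on the last delay once the schedule is exhausted is an assumption, since this README does not say what happens after the final entry.

```ts
// Default schedule copied from the reconnectBackoffMs row above.
const DEFAULT_BACKOFF_MS: readonly number[] = [1000, 2000, 5000, 10000, 30000];

// Hypothetical: http:// becomes ws://, https:// becomes wss://.
// Anything else (including ws:// and wss://) is returned unchanged.
function toWebSocketUrl(coreUrl: string): string {
  return coreUrl.replace(/^http(s?):\/\//, "ws$1://");
}

// Hypothetical: delay before reconnect attempt number `attempt` (0-based).
// Assumption: once the schedule runs out, the last entry is reused.
function backoffDelayMs(
  attempt: number,
  schedule: readonly number[] = DEFAULT_BACKOFF_MS,
): number {
  if (schedule.length === 0) {
    throw new RangeError("backoff schedule must not be empty");
  }
  const index = Math.min(Math.max(0, Math.floor(attempt)), schedule.length - 1);
  return schedule[index];
}
```

With the default schedule, attempts 0, 1 and 2 would wait 1 s, 2 s and 5 s under this sketch, and every attempt from the fifth onward would wait 30 s.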
### Methods
- `listPhoneNumbers(): Promise<OperatorPhoneNumber[]>` — list the phone numbers
the account owns (`{ id, phoneNumber, displayName }`). Fetched via the SDK
using the same session token as the WebSockets (the Account API Key never
enters the browser). Use a number's `id` for `goOnline([...])` and its
`phoneNumber` (E.164) as the `callerId` for `dial`. On a persistent `401` it
throws `OperatorRequestError` (`code: "UNAUTHORIZED"`); other failures throw
with `code: "REQUEST_FAILED"`.
- `goOnline(phoneNumberIds: string[]): Promise<void>` — connect the presence
channel and announce availability. Call again to change the number set.
- `goOffline(): Promise<void>` — drop availability, end any active call, close
the presence channel (stops auto-reconnect).
- `dial(to: string, { callerId }): Promise<OperatorCall>` — place an outbound
call. Resolves when the platform accepts and the audio bridge is opening.
- `get state: PresenceState` — `"offline" | "connecting" | "online" | "reconnecting"`.
- `get currentCall: OperatorCall | null`.
- `on / off / once(event, listener)` — typed event subscription; `on` returns
an unsubscribe function.
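
The "`on` returns an unsubscribe function" pattern can be sketched with a tiny typed emitter. This is a generic illustration, not the SDK's internals: `TinyEmitter` and `DemoEvents` are made-up names, and here `once` returns nothing because the README only documents a return value for `on`.

```ts
type Listener<T> = (payload: T) => void;

// Hypothetical stand-in for a typed event source.
class TinyEmitter<Events extends Record<string, unknown>> {
  private listeners = new Map<keyof Events, Set<Listener<any>>>();

  // Subscribe and get back a function that removes this exact listener.
  on<K extends keyof Events>(event: K, listener: Listener<Events[K]>): () => void {
    let set = this.listeners.get(event);
    if (!set) {
      set = new Set();
      this.listeners.set(event, set);
    }
    set.add(listener);
    return () => this.off(event, listener);
  }

  off<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    this.listeners.get(event)?.delete(listener);
  }

  // Fire at most once: unsubscribe before invoking the listener.
  once<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    const unsubscribe: () => void = this.on(event, (payload) => {
      unsubscribe();
      listener(payload);
    });
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    const set = this.listeners.get(event);
    if (!set) return;
    // Copy first so listeners may unsubscribe while we iterate.
    for (const listener of Array.from(set)) listener(payload);
  }
}

// Example event map, loosely modelled on the Events table.
type DemoEvents = { presenceStateChanged: string };
```

In a component-based UI, the returned function is convenient to call from a teardown hook, so a listener does not outlive the view that registered it.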
### Events
| Event | Payload | When |
|---|---|---|
| `presenceStateChanged` | `PresenceState` | presence channel state changes |
| `availabilityChanged` | `{ accepted: string[]; rejected: string[] }` | server confirms which numbers you went online on |
| `incomingCall` | `OperatorCall` | an inbound call is ringing |
| `callStarted` | `OperatorCall` | audio is established (after accept / dial) |
| `callEnded` | `OperatorCall` | a call terminated (inspect `endReason`) |
| `error` | `{ code, message, cause? }` | a non-fatal error |
### `OperatorCall`
```ts
interface OperatorCall {
  readonly callSid: string;
  readonly direction: "inbound" | "outbound";
  readonly from?: string;
  readonly to?: string;
  readonly state: "ringing" | "connecting" | "active" | "ended";
  readonly endReason?: CallEndReason;
  readonly muted: boolean;
  accept(): void; // answer an inbound offer (no-op unless ringing)
  reject(): void; // decline an inbound offer (no-op unless ringing)
  mute(on: boolean): void;
  sendDtmf(digit: string): void; // "0"-"9", "*", "#"; see note below
  hangup(): void;
}
```
> **`sendDtmf` is not yet supported end-to-end.** The digit is validated and
> sent as a control message on the call channel, but the server does not yet
> forward the tones onto the telephone network, so the remote party will not
> hear them today — it is effectively a functional no-op for the far end. The
> method (and its wire message) are kept so that enabling it server-side in a
> future release needs no SDK change. Do not rely on it for IVR navigation yet.
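
If your UI exposes a keypad anyway, you may want to validate input before calling `sendDtmf`. A minimal sketch follows; `isDtmfDigit` and `splitDtmf` are hypothetical helpers, and the accepted set is taken from the interface comment above (`"0"`-`"9"`, `"*"`, `"#"`). What the SDK itself does with an invalid digit is not specified here.

```ts
// Exactly one character from the documented set.
const DTMF_DIGIT = /^[0-9*#]$/;

function isDtmfDigit(value: string): boolean {
  return DTMF_DIGIT.test(value);
}

// Split a typed sequence such as "12#" into single digits,
// refusing the whole sequence if any character is invalid.
function splitDtmf(sequence: string): string[] {
  const digits = Array.from(sequence);
  const bad = digits.find((d) => !isDtmfDigit(d));
  if (bad !== undefined) {
    throw new RangeError(`not a DTMF digit: ${JSON.stringify(bad)}`);
  }
  return digits;
}

// Possible use with an active call (shown as a comment):
//   for (const digit of splitDtmf("12#")) call.sendDtmf(digit);
```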
`endReason` is one of: `hangup`, `remote_hangup`, `taken_by_other`, `rejected`,
`no_answer`, `failed`, `offline`.
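
A `callEnded` handler will usually translate `endReason` into something the operator can read. The sketch below mirrors the list above in a local union type; the wording of each message is a guess based on the reason's name, not a definition from the SDK, and `describeEndReason` is a hypothetical helper.

```ts
// Local copy of the documented values; the real type is exported by the SDK.
type CallEndReason =
  | "hangup"
  | "remote_hangup"
  | "taken_by_other"
  | "rejected"
  | "no_answer"
  | "failed"
  | "offline";

// Hypothetical mapping; adjust the wording to your product.
function describeEndReason(reason: CallEndReason | undefined): string {
  switch (reason) {
    case undefined:
      return "Call ended";
    case "hangup":
      return "You ended the call";
    case "remote_hangup":
      return "The other party hung up";
    case "taken_by_other":
      return "Call was taken elsewhere"; // assumed meaning
    case "rejected":
      return "Call was rejected";
    case "no_answer":
      return "No answer";
    case "failed":
      return "Call failed";
    case "offline":
      return "Call ended because the operator is offline"; // assumed meaning
    default: {
      // Compile-time exhaustiveness check; at runtime this still
      // copes with a value added by a newer SDK release.
      const unknownReason: never = reason;
      return `Call ended (${String(unknownReason)})`;
    }
  }
}
```

The `never` assignment makes the compiler flag this function if a new reason is added to the union and not handled.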
---
## Concurrency (MVP)
This release handles **one active call at a time**. While a call is live:
- an incoming offer is **automatically declined** (you won't get an
`incomingCall` event for it), and
- `dial()` **rejects** with an error.
This keeps the audio graph and the state machine simple. Multi-line support can
be added later without changing this public API.
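
Because `dial()` rejects while a call is live, a UI may prefer to disable its dial button up front. The check below is a sketch under assumptions: `isLineBusy` is a hypothetical helper, it works against a minimal structural type rather than the SDK's own classes, and it treats any `currentCall` whose state is not `"ended"` as live, which this README does not spell out.

```ts
// Minimal shapes, copied from the documented getters.
interface CallLike {
  readonly state: "ringing" | "connecting" | "active" | "ended";
}
interface OperatorLike {
  readonly currentCall: CallLike | null;
}

// Assumption: a call counts as "live" until its state is "ended".
function isLineBusy(op: OperatorLike): boolean {
  const call = op.currentCall;
  return call !== null && call.state !== "ended";
}

// Possible use (shown as a comment):
//   dialButton.disabled = isLineBusy(op);
//   if (!isLineBusy(op)) await op.dial(number, { callerId });
```

The pre-check does not replace error handling: a call can arrive between the check and the `dial()`, so the rejection still needs a `catch`.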
---
## How the audio works (for the curious)
You don't need to know any of this to use the SDK, but for completeness:
- The microphone is captured with `getUserMedia` and fed into a Web Audio
`AudioContext`.
- An `AudioWorklet` runs on the audio thread and does the format conversion:
on capture it downsamples from the context rate (typically 48 kHz) to 8 kHz
and encodes **G.711 μ-law**; on playback it decodes μ-law and upsamples back
to the context rate. Resampling is linear interpolation.
- Encoded audio is streamed as binary frames over the call's audio WebSocket;
audio from the far end arrives the same way and is played back.
- Mute, DTMF and hangup are small JSON control messages on the same socket.
(DTMF is sent but not yet forwarded to the phone network — see the
`sendDtmf` note above.)
The μ-law codec and the resampling helpers are also exported from the package
root (`encodeMuLaw`, `decodeMuLaw`, `resampleLinear`, …) for advanced
integrators who want to build their own audio path.
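
For readers who want to see the capture path in code, here is a standalone sketch of linear-interpolation resampling followed by G.711 μ-law encoding. It is written from the public G.711 algorithm and is not the package's source: the function names are hypothetical and deliberately differ from the exported `encodeMuLaw`, `decodeMuLaw` and `resampleLinear`, whose signatures are not documented in this README.

```ts
const MULAW_BIAS = 0x84; // 132, added before finding the segment
const MULAW_CLIP = 32635; // largest magnitude that survives the bias

// Encode one 16-bit signed PCM sample into one mu-law byte.
function muLawEncodeSample(pcm16: number): number {
  const sign = pcm16 < 0 ? 0x80 : 0x00;
  let magnitude = Math.min(Math.abs(pcm16), MULAW_CLIP) + MULAW_BIAS;
  // Segment (exponent) = position of the highest set bit, from bit 14 down to bit 7.
  let exponent = 7;
  for (let mask = 0x4000; exponent > 0 && (magnitude & mask) === 0; mask >>= 1) {
    exponent--;
  }
  const mantissa = (magnitude >> (exponent + 3)) & 0x0f;
  return ~(sign | (exponent << 4) | mantissa) & 0xff; // bits are stored inverted
}

// Decode one mu-law byte back to a 16-bit signed PCM sample.
function muLawDecodeSample(byte: number): number {
  const u = ~byte & 0xff;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS;
  return (u & 0x80) !== 0 ? -magnitude : magnitude;
}

// Resample by linear interpolation between neighbouring input samples.
function resampleLinearSketch(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  const out = new Float32Array(Math.floor((input.length * toRate) / fromRate));
  const step = fromRate / toRate;
  for (let i = 0; i < out.length; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}

// Capture direction: float samples at the context rate -> 8 kHz mu-law bytes.
function encodeCaptureFrame(input: Float32Array, contextRate: number): Uint8Array {
  const narrow = resampleLinearSketch(input, contextRate, 8000);
  const bytes = new Uint8Array(narrow.length);
  for (let i = 0; i < narrow.length; i++) {
    const clamped = Math.max(-1, Math.min(1, narrow[i]));
    bytes[i] = muLawEncodeSample(Math.round(clamped * 32767));
  }
  return bytes;
}
```

Linear interpolation is cheap enough for the audio thread but applies no low-pass filter, so content above 4 kHz can alias when downsampling to 8 kHz. μ-law is lossy by design: a sample of 1000 decodes to 988 in this sketch.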
---
## Browser support
Requires a modern browser with `AudioWorklet`, `getUserMedia` and `WebSocket`
(all current Chromium, Firefox and Safari releases). The page must be served
over **HTTPS** (or `localhost`) for microphone access. The first call may need a
user gesture to resume the `AudioContext` under autoplay policies.
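
A page can check these requirements before constructing the SDK and show a helpful message instead of failing on the first call. The sketch below is one way to do that; `missingFeatures` and `BrowserLike` are hypothetical names, and the function takes the global object as a parameter so it can be exercised outside a browser.

```ts
// The handful of globals this check looks at.
interface BrowserLike {
  AudioWorkletNode?: unknown;
  WebSocket?: unknown;
  isSecureContext?: boolean;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

// Returns human-readable names of whatever is missing (empty = good to go).
function missingFeatures(env: BrowserLike): string[] {
  const missing: string[] = [];
  if (typeof env.AudioWorkletNode !== "function") {
    missing.push("AudioWorklet");
  }
  // navigator.mediaDevices is undefined outside secure contexts.
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== "function") {
    missing.push("getUserMedia");
  }
  if (typeof env.WebSocket !== "function") {
    missing.push("WebSocket");
  }
  if (env.isSecureContext !== true) {
    missing.push("secure context (HTTPS or localhost)");
  }
  return missing;
}

// Possible use in the page (shown as a comment):
//   const problems = missingFeatures(window);
//   if (problems.length > 0) showBanner(`Unsupported: ${problems.join(", ")}`);
```

`showBanner` in the comment stands for whatever notification mechanism your app already has.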
---
## Manual-test demo
A single static page for hands-on testing lives at
[`examples/operator-demo.html`](./examples/operator-demo.html). It loads the
UMD build and gives you a minimal operator console (go online, accept/reject
incoming calls, dial, mute, hang up, event log).
```bash
# 1. Build so the UMD bundle exists (the page loads ../dist/audin-operator-sdk.umd.cjs)
npm run build
# 2. Serve over http://localhost (microphone needs a secure context; file:// won't work)
npx serve .
# then open http://localhost:3000/examples/operator-demo.html
```
The demo needs a backend **you** control that exposes a token endpoint (see
[The token flow](#the-token-flow-read-this-first)) — paste its URL into the
"token endpoint" field. The Account API Key never enters the page. It is for
manual testing only, not a production UI.
---
## License
MIT — see [LICENSE](./LICENSE).