Signal Protocol Implementation

Overview

whatsapp-rust implements the Signal Protocol for end-to-end encryption of both one-on-one and group messages. The implementation is based on Signal’s libsignal library, adapted for WhatsApp’s specific protocol requirements.

The Signal Protocol implementation handles cryptographic primitives. Any modifications to this code require expert-level understanding of cryptographic protocols to avoid security vulnerabilities.

Architecture

The Signal Protocol implementation is split across two main locations:

wacore/libsignal/ - Platform-agnostic Signal Protocol core (Rust port of libsignal)
src/store/signal*.rs - WhatsApp-specific storage integration with Diesel/SQLite

Key Components

wacore/libsignal/src/
├── protocol/
│   ├── session_cipher.rs      # Encryption/decryption for 1:1 messages
│   ├── group_cipher.rs        # Encryption/decryption for group messages
│   ├── ratchet.rs            # Double Ratchet implementation
│   ├── sender_keys.rs        # Sender Key protocol for groups
│   └── state/                # Session state management
└── crypto/
    ├── aes_cbc.rs            # AES-256-CBC for message content
    ├── aes_gcm.rs            # AES-GCM for media encryption
    └── hash.rs               # HKDF and HMAC primitives

Double ratchet protocol

The Double Ratchet algorithm provides forward secrecy and post-compromise security for 1:1 messages.

Session Initialization

Two participants initialize a session using Diffie-Hellman key exchange:

// Alice initiates the session (sender)
pub fn initialize_alice_session<R: Rng + CryptoRng>(
    parameters: &AliceSignalProtocolParameters,
    csprng: &mut R,
) -> Result<SessionState>

// Bob receives the session (recipient)
pub fn initialize_bob_session(
    parameters: &BobSignalProtocolParameters
) -> Result<SessionState>

Key Derivation:

Compute shared secrets from ephemeral key exchanges

Derive root key and chain key using HKDF-SHA256:

HKDF(discontinuity_bytes || DH1 || DH2 || DH3 [|| DH4])
→ (RootKey[32], ChainKey[32], PQRKey[32])

Initialize sender and receiver chains

Location: wacore/libsignal/src/protocol/ratchet.rs:41-172

Message Encryption

Each message advances the sender chain and derives ephemeral message keys:

// From wacore/libsignal/src/protocol/session_cipher.rs:65-183
pub async fn message_encrypt(
    ptext: &[u8],
    remote_address: &ProtocolAddress,
    session_store: &mut dyn SessionStore,
    identity_store: &mut dyn IdentityKeyStore,
) -> Result<CiphertextMessage>

Process:

Load current session state

Get sender chain key and derive message keys:

let (message_keys_gen, next_chain_key) = chain_key.step_with_message_keys()?;
let message_keys = message_keys_gen.generate_keys();
// message_keys contains: cipher_key, mac_key, iv

Encrypt plaintext with AES-256-CBC:

aes_256_cbc_encrypt_into(ptext, message_keys.cipher_key(), 
                         message_keys.iv(), &mut buf)

Create SignalMessage with MAC for authentication
Advance chain key and save session state

Message Format:

SignalMessage: Standard encrypted message
PreKeySignalMessage: Includes prekey bundle for session establishment

Plaintext padding. Before encryption, the serialized wa::Message is padded with a uniform-random number of bytes in 1..=16 (the pad length is repeated as the byte value, matching WA Web’s rand % 16 + 1 and whatsmeow). v0.6 fixed a prior scheme that masked the length with & 0x0F, which skewed the distribution toward 15 and could never emit 16 — a subtle fingerprinting divergence from the official client. The receiver strips the padding by reading the final byte as the length.
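The padding scheme described above can be sketched as follows. This is a minimal illustration, not the library's code: the function names are hypothetical, the random byte is supplied by the caller (a real implementation would draw it from a CSPRNG), and the sanity checks in the unpad step are an assumption.

```rust
/// Map one uniformly random byte to a pad length in 1..=16.
/// 256 is a multiple of 16, so `% 16` introduces no modulo bias.
fn pad_len_from_random_byte(random_byte: u8) -> u8 {
    random_byte % 16 + 1
}

/// Append `pad_len` copies of the byte value `pad_len` to the plaintext.
fn pad_plaintext(plaintext: &[u8], pad_len: u8) -> Vec<u8> {
    assert!((1..=16).contains(&pad_len), "pad length must be in 1..=16");
    let mut out = Vec::with_capacity(plaintext.len() + pad_len as usize);
    out.extend_from_slice(plaintext);
    out.extend(std::iter::repeat(pad_len).take(pad_len as usize));
    out
}

/// Strip the padding by reading the final byte as the pad length.
/// Returns `None` for an empty buffer or an implausible length
/// (this validation is an assumption of the sketch).
fn unpad_plaintext(padded: &[u8]) -> Option<&[u8]> {
    let pad_len = *padded.last()? as usize;
    if pad_len == 0 || pad_len > padded.len() {
        return None;
    }
    Some(&padded[..padded.len() - pad_len])
}
```

Because every pad length in 1..=16 is produced by exactly 16 of the 256 possible byte values, the distribution is uniform, which is the property the older `& 0x0F` masking lacked.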

Message Decryption

Decryption handles out-of-order delivery and tries multiple session states:

// From wacore/libsignal/src/protocol/session_cipher.rs:292-363
pub async fn message_decrypt_signal<R: Rng + CryptoRng>(
    ciphertext: &SignalMessage,
    remote_address: &ProtocolAddress,
    session_store: &mut dyn SessionStore,
    identity_store: &mut dyn IdentityKeyStore,
    csprng: &mut R,
) -> Result<Vec<u8>>

Process:

Try current session state first
If MAC verification fails, try previous (archived) sessions
Derive/retrieve message keys for the counter

Verify MAC:

ciphertext.verify_mac(&their_identity_key, &local_identity_key, 
                      message_keys.mac_key())

Decrypt with AES-256-CBC
Promote successful session to current if needed

The implementation optimizes memory by using take/restore patterns to avoid cloning session states during decryption attempts (see session_cipher.rs:495-619).

Chain key ratcheting

Message keys are derived from chain keys, which advance with each message:

pub struct ChainKey {
    key: [u8; 32],
    index: u32,
}

impl ChainKey {
    pub fn step_with_message_keys(self) -> Result<(MessageKeyGenerator, ChainKey)> {
        let message_key_gen = MessageKeyGenerator::new(self.key, self.index);
        let next_chain_key = self.next_chain_key()?;
        Ok((message_key_gen, next_chain_key))
    }
}

Location: wacore/libsignal/src/protocol/ratchet/keys.rs

Chain key overflow protection

The chain key index is a u32 that increments with each message. Without overflow protection, the index could silently wrap past u32::MAX (4,294,967,295) back to 0, creating a counter reuse vulnerability that breaks cryptographic guarantees (nonce reuse in message key derivation). Both 1:1 and group chain keys use checked_add() to return a typed error instead of wrapping:

// 1:1 chain keys (ratchet/keys.rs)
pub fn next_chain_key(&self) -> crate::protocol::Result<Self> {
    Ok(Self {
        key: self.calculate_base_material(Self::CHAIN_KEY_SEED),
        index: self.index.checked_add(1).ok_or_else(|| {
            SignalProtocolError::InvalidState(
                "next_chain_key",
                "chain key index overflow (u32::MAX)".to_string(),
            )
        })?,
    })
}

// Group sender chain keys (sender_keys.rs)
let new_iteration = self.iteration.checked_add(1).ok_or_else(|| {
    SignalProtocolError::InvalidState(
        "sender_chain_key_next",
        "Sender chain is too long".into(),
    )
})?;

A chain key reaching u32::MAX iterations indicates an abnormally long-lived session. In practice this should never occur — ratchet key rotations reset the chain counter with each new Diffie-Hellman exchange.

Location: wacore/libsignal/src/protocol/ratchet/keys.rs, wacore/libsignal/src/protocol/sender_keys.rs

Forward Jumps

The protocol tolerates out-of-order messages up to a limit:

const MAX_FORWARD_JUMPS: usize = 25000;

if jump > MAX_FORWARD_JUMPS {
    return Err(SignalProtocolError::InvalidMessage(
        original_message_type,
        "message from too far into the future",
    ));
}

Location: wacore/libsignal/src/protocol/session_cipher.rs:832-847
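As a rough sketch of how such a bound can be evaluated, the function below computes the jump between the receiver chain's current index and an incoming message counter. The `JumpError` enum and the `forward_jump` helper are hypothetical names for illustration; the handling of counters that are behind the chain is an assumption and is only signalled here, not implemented.

```rust
const MAX_FORWARD_JUMPS: usize = 25_000;

#[derive(Debug, PartialEq)]
enum JumpError {
    /// Counter is behind the chain index; a real implementation would look
    /// for a previously cached message key instead (assumption).
    Behind,
    /// Counter is more than MAX_FORWARD_JUMPS ahead of the chain index.
    TooFarAhead,
}

/// Number of chain steps needed to reach `message_counter`, or an error
/// if the message is behind the chain or too far into the future.
fn forward_jump(chain_index: u32, message_counter: u32) -> Result<usize, JumpError> {
    if message_counter < chain_index {
        return Err(JumpError::Behind);
    }
    let jump = (message_counter - chain_index) as usize;
    if jump > MAX_FORWARD_JUMPS {
        return Err(JumpError::TooFarAhead);
    }
    Ok(jump)
}
```

A jump of exactly 25,000 is accepted; 25,001 is rejected, mirroring the strict `>` comparison in the snippet above.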

DM device fanout

When sending a direct message, the library resolves all known devices for both the recipient and your own account, then encrypts two different plaintexts for two categories of devices:

Recipient devices receive the actual message content
Own other devices (your other linked devices) receive a DeviceSentMessage wrapper containing the message plus the destination JID, so your other devices can display the sent message in the correct chat

Device resolution

The DM send path builds the full device list in a WA Web-compliant manner (matching WAWebSendUserMsgJob and WAWebDBDeviceListFanout):

Local registry first — the client checks the local device registry via get_devices_from_registry() for both the recipient and own account. A network fetch (get_user_devices) is only triggered on a cache miss, avoiding unnecessary LID-migration side effects.
Hosted device filtering — devices flagged as hosted (via is_hosted()) are filtered out, matching WA Web’s DBDeviceListFanout exclusion.
Sender device exclusion — the exact sender device is removed from the list so ensure_e2e_sessions never creates a self-session. This matches WA Web’s isMeDevice check in getFanOutList.
Self-DM deduplication — when sending to your own account, the recipient and own device lists overlap. A HashSet-based dedup pass (matching WA Web’s Map keyed by toString) removes duplicates.

// Build device list — local registry first, network on miss
let mut recipient_cached = self.get_devices_from_registry(&recipient_bare).await;
if recipient_cached.is_none() {
    let _ = self.get_user_devices(std::slice::from_ref(&to)).await;
    recipient_cached = self.get_devices_from_registry(&recipient_bare).await;
}

// Filter hosted devices, exclude sender, dedup for self-DMs
all_dm_jids.retain(|j| !j.is_hosted());
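// own_jid / own_lid: this client's PN and LID JIDs (assumed to be in scope here)
all_dm_jids.retain(|j| !is_exact_dm_sender_device(j, own_jid, own_lid));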
// HashSet dedup for self-DM overlap

Device partitioning

The partition_dm_devices function splits all resolved devices into recipient and own groups, and excludes the exact sender device (the current device) entirely:

fn partition_dm_devices(
    all_devices: Vec<Jid>,
    own_jid: &Jid,
    own_lid: Option<&Jid>,
) -> (Vec<Jid>, Vec<Jid>)

Sender device exclusion

The exact sender device is identified by matching both the user and device ID against your phone number JID (PN) or your Linked Identity JID (LID):

fn is_exact_dm_sender_device(device_jid: &Jid, own_jid: &Jid, own_lid: Option<&Jid>) -> bool {
    (device_jid.is_same_user_as(own_jid) && device_jid.device == own_jid.device)
        || own_lid.is_some_and(|lid|
            device_jid.is_same_user_as(lid) && device_jid.device == lid.device
        )
}

Own device recognition

After excluding the sender device, the remaining devices are classified using matches_user_or_lid, which checks if a device JID belongs to the same user as either your PN or LID:

pub fn matches_user_or_lid(&self, user: &Jid, lid: Option<&Jid>) -> bool {
    self.is_same_user_as(user) || lid.is_some_and(|l| self.is_same_user_as(l))
}

This ensures that your own devices registered under your LID (common in multi-device setups) are correctly classified as “own” devices and receive the DeviceSentMessage plaintext — not the recipient plaintext. Without LID matching, your own LID-based devices would be misclassified as recipient devices, causing them to receive the wrong message format.

Both PN-based and LID-based devices must be checked because WhatsApp’s multi-device architecture uses both addressing schemes. A user’s devices may appear under either their phone number JID (@s.whatsapp.net) or their Linked Identity JID (@lid), depending on the device type and registration path.
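The partitioning logic can be sketched end to end with a simplified JID type. Everything here is an illustrative assumption: the `Jid` struct is a stand-in with only three fields, `is_same_user_as` compares user and server, and the return order `(recipient, own)` is a guess at the real function's convention.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct Jid {
    user: String,
    server: String,
    device: u16,
}

impl Jid {
    fn new(user: &str, server: &str, device: u16) -> Self {
        Jid { user: user.to_string(), server: server.to_string(), device }
    }

    // Simplified: same user and same server, ignoring the device.
    fn is_same_user_as(&self, other: &Jid) -> bool {
        self.user == other.user && self.server == other.server
    }

    fn matches_user_or_lid(&self, user: &Jid, lid: Option<&Jid>) -> bool {
        self.is_same_user_as(user) || lid.is_some_and(|l| self.is_same_user_as(l))
    }
}

fn is_exact_dm_sender_device(device_jid: &Jid, own_jid: &Jid, own_lid: Option<&Jid>) -> bool {
    (device_jid.is_same_user_as(own_jid) && device_jid.device == own_jid.device)
        || own_lid.is_some_and(|lid| {
            device_jid.is_same_user_as(lid) && device_jid.device == lid.device
        })
}

/// Returns (recipient_devices, own_other_devices); the sender device is dropped.
fn partition_dm_devices(
    all_devices: Vec<Jid>,
    own_jid: &Jid,
    own_lid: Option<&Jid>,
) -> (Vec<Jid>, Vec<Jid>) {
    let mut recipient = Vec::new();
    let mut own = Vec::new();
    for device in all_devices {
        if is_exact_dm_sender_device(&device, own_jid, own_lid) {
            continue; // never encrypt to the device that is sending
        }
        if device.matches_user_or_lid(own_jid, own_lid) {
            own.push(device); // gets the DeviceSentMessage plaintext
        } else {
            recipient.push(device); // gets the actual message content
        }
    }
    (recipient, own)
}
```

With this shape, an own device registered under the LID lands in the second list rather than the first, which is the misclassification the LID check is meant to prevent.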

PreparedDmStanza

prepare_dm_stanza returns a PreparedDmStanza struct containing the stanza node and the locally computed phash for server ACK validation:

pub struct PreparedDmStanza {
    pub node: Node,
    /// Locally computed phash from the sent device set. Not sent on the
    /// wire (WA Web only sends phash for groups). Used by the caller to
    /// compare against the server's ACK phash for device-list drift detection.
    pub phash: Option<String>,
}

The phash is computed from the actual sent device set (after partitioning, with the sender excluded) using MessageUtils::participant_list_hash(). Unlike group messages, the DM phash is not sent on the wire — WA Web only includes phash in the DeviceSentMessage for groups. The DM phash is used purely for local validation against the server’s ACK to detect device-list drift.

The DeviceSentMessage.phash field is set to None for DMs, matching WA Web’s behavior where only group DeviceSentMessage wrappers include a phash. The DM phash is computed and tracked separately by the caller.

Location: wacore/src/send.rs:675-820, src/send.rs

PN→LID session migration

WhatsApp’s multi-device architecture uses two addressing schemes: phone number JIDs (PN, @s.whatsapp.net) and Linked Identity JIDs (LID, @lid). WhatsApp Web always resolves PN→LID before any session operation via createSignalAddress(). whatsapp-rust mirrors this behavior — when a LID mapping is discovered for a phone number, any Signal sessions stored under the PN address are automatically migrated to the corresponding LID address.

Signal address resolution

Client::resolve_encryption_jid() mirrors WA Web’s SignalAddress.toString() (WAWeb/Signal/Address.js). It upgrades the JID’s server to its LID counterpart when a mapping is known, and otherwise returns the input unchanged:

Input `Server`	Resolved `Server` (mapping known)	No mapping
`Pn`	`Lid`	`Pn` (preserved)
`Hosted`	`HostedLid`	`Hosted` (preserved)
Any other	unchanged	unchanged

The device, agent, and integrator fields always round-trip — only the user (replaced with the LID user) and server change. This keeps Cloud API / Meta Business hosted devices on a hosted-flavored LID address rather than collapsing them into the standard @lid server, matching WA Web’s per-device session keying.
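The resolution table can be expressed as a small pure function. This is a sketch under stated assumptions: the `Server` enum and `Jid` struct below are simplified stand-ins (field types are guesses), and the LID lookup is replaced by an `Option<&str>` passed in by the caller rather than a real mapping store.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Server {
    Pn,
    Lid,
    Hosted,
    HostedLid,
    Group, // stands in for "any other" server
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Jid {
    user: String,
    server: Server,
    device: u16,
    agent: u8,
    integrator: u16,
}

/// Upgrade a PN/Hosted JID to its LID counterpart when a mapping is known.
/// `lid_user` is the mapped LID user, or `None` when no mapping exists.
fn resolve_encryption_jid(jid: &Jid, lid_user: Option<&str>) -> Jid {
    let target = match jid.server {
        Server::Pn => Server::Lid,
        Server::Hosted => Server::HostedLid,
        _ => return jid.clone(), // any other server is returned unchanged
    };
    match lid_user {
        // Only user and server change; device, agent, integrator round-trip.
        Some(lid) => Jid { user: lid.to_string(), server: target, ..jid.clone() },
        None => jid.clone(),
    }
}
```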

Why migration is needed

After pairing, the primary phone may initially establish sessions under a PN address. Once the LID mapping becomes known (from usync, incoming messages, or device notifications), the phone begins sending from the LID address. Without migration, the client holds a session under the PN address but receives messages addressed to the LID — causing SessionNotFound decryption failures.

Proactive migration at LID discovery

When a new LID-PN mapping is learned (via add_lid_pn_mapping), the client scans devices 0–99 for PN-keyed sessions and migrates them. All reads and writes go through the SignalStoreCache rather than the backend directly — this prevents reading stale data when the cache has unflushed mutations (e.g., after SKDM encryption ratcheted the session). The migrated state is flushed to the backend at the end so it survives restarts.

// src/client/lid_pn.rs
pub(crate) async fn migrate_signal_sessions_on_lid_discovery(&self, pn: &str, lid: &str) {
    for device_id in 0..=99u16 {
        // Read from signal_cache (authoritative over backend)
        // If PN session exists and no LID session → move session to LID via cache
        // If both exist → delete the stale PN session from cache
        // Identity keys are migrated independently of sessions
    }
    // Flush migrated state to backend so it survives restarts
    self.signal_cache.flush(backend.as_ref()).await;
}

Migration rules per device:

PN session	LID session	Action
Exists	Does not exist	Move session and identity from PN→LID address
Exists	Exists	Delete stale PN session (LID takes precedence)
Does not exist	Any	No action

Identity keys are migrated independently of sessions — they can outlive deleted sessions and survive session re-establishment.

The migration reads through the cache because the backend may contain stale session data when unflushed cache mutations exist. Reading directly from the backend could skip in-flight ratchet advances, causing the migrated session to decrypt with an outdated chain key.
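The per-device rules can be illustrated with an in-memory map standing in for the signal cache. This is a sketch only: `SessionMap`, `MigrationAction`, and `migrate_sessions` are hypothetical names, session records are opaque byte vectors, and identity-key migration and the final flush to the backend are left out.

```rust
use std::collections::HashMap;

/// (address user, device id) -> serialized session record (stand-in for the cache).
type SessionMap = HashMap<(String, u16), Vec<u8>>;

#[derive(Debug, PartialEq, Eq)]
enum MigrationAction {
    MoveToLid,
    DeleteStalePn,
    NoAction,
}

/// Decision table from the migration rules above.
fn migration_action(pn_session_exists: bool, lid_session_exists: bool) -> MigrationAction {
    match (pn_session_exists, lid_session_exists) {
        (true, false) => MigrationAction::MoveToLid,
        (true, true) => MigrationAction::DeleteStalePn,
        (false, _) => MigrationAction::NoAction,
    }
}

/// Apply the rules to devices 0..=99; returns how many devices were touched.
fn migrate_sessions(sessions: &mut SessionMap, pn: &str, lid: &str) -> usize {
    let mut changed = 0;
    for device_id in 0..=99u16 {
        let pn_key = (pn.to_string(), device_id);
        let lid_key = (lid.to_string(), device_id);
        let action = migration_action(
            sessions.contains_key(&pn_key),
            sessions.contains_key(&lid_key),
        );
        match action {
            MigrationAction::MoveToLid => {
                if let Some(record) = sessions.remove(&pn_key) {
                    sessions.insert(lid_key, record);
                    changed += 1;
                }
            }
            MigrationAction::DeleteStalePn => {
                sessions.remove(&pn_key); // the LID session takes precedence
                changed += 1;
            }
            MigrationAction::NoAction => {}
        }
    }
    changed
}
```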

On-the-fly migration during decryption

If a message arrives from a LID address and decryption fails with SessionNotFound or InvalidPreKeyId, the client attempts PN→LID migration as a fallback before requesting a retry:

Look up the PN for the sender’s LID
Attempt to migrate PN sessions to LID via the signal cache (same cache-first logic as proactive migration)
Retry decryption with the migrated session (already in the cache — no reload needed)
If DuplicateMessage occurs during post-migration retry, it is silently ignored
Fall back to retry receipt only if migration does not resolve the issue

The InvalidPreKeyId case occurs when a PreKeyMessage references a consumed one-time prekey, but the session actually exists under a PN address (legacy migration). Migrating the session lets Signal use the existing ratchet state instead of looking up the consumed prekey. This migration is attempted in both the identity-change retry path and the initial decryption path, so existing databases are fixed without requiring re-pairing.

At login, the client checks the session state of own device 0 (primary phone):

LID session exists — no action needed
PN session only — logged; migration deferred to first message via on-the-fly path
No session — will be established on first message exchange

// src/client/sessions.rs
pub(crate) async fn establish_primary_phone_session_immediate(&self) -> Result<()> {
    // Checks LID session → logs PN-only state → defers migration to message path
}

Both migration paths route through the SignalStoreCache, ensuring they see the latest in-memory state. The proactive migration runs when a LID mapping is first discovered and flushes to the backend afterward. The on-the-fly migration handles the case where the database already contains stale PN sessions from before the mapping was known.

Location: src/client/lid_pn.rs, src/client/sessions.rs, src/message.rs

Sender keys (group encryption)

Groups use the Sender Key protocol for efficient multi-recipient encryption.

Sender key address normalization

Sender key records are keyed by a composite SenderKeyName containing the group JID and a sender protocol address string. WhatsApp delivers group stanzas with inconsistent sender addressing: the pkmsg (which carries the SKDM) arrives with a device-qualified participant JID (e.g., 100000000000001.1:75@lid), while the skmsg (the actual encrypted group message) arrives with a bare participant JID (e.g., 100000000000001.1@lid).

Without normalization, the sender key would be stored under the device-qualified address during SKDM processing but looked up under the bare address during skmsg decryption, causing NoSenderKeyState failures.

The client normalizes the sender JID to its bare form using to_non_ad() (which strips the device component, setting device = 0, agent = 0) at every point where a SenderKeyName is constructed. The SenderKeyName::from_jid() convenience method handles the to_string() conversion automatically:

// Decryption path (src/message.rs) — normalize before group_decrypt
let sender_for_sk = info.source.sender.to_non_ad();
let sender_address = sender_for_sk.to_protocol_address();
let sender_key_name = SenderKeyName::from_jid(&info.source.chat, &sender_address);

// SKDM storage path (src/message.rs) — normalize before process_sender_key_distribution_message
let sender_bare = sender_jid.to_non_ad();
let sender_address = sender_bare.to_protocol_address();
let sender_key_name = SenderKeyName::from_jid(&group_jid, &sender_address);

SenderKeyName::from_jid() is equivalent to SenderKeyName::new(group_jid.to_string(), sender_address.to_string()) but avoids the manual to_string() calls and is the preferred constructor. This ensures the cache key is always in the form "{group}:{bare_user}@{server}.0", regardless of whether the original stanza used a device-qualified or bare JID.

Custom implementations that construct SenderKeyName directly must also normalize the sender JID to its bare form. Failing to do so will cause sender key lookup mismatches and decryption failures for group messages.
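To see why normalization matters, the sketch below builds a cache key from a simplified JID. The `Jid` struct and the `sender_key_cache_key` helper are hypothetical; only the idea of zeroing device and agent before building the key, and the resulting "{group}:{bare_user}@{server}.0" shape, come from the text above.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct Jid {
    user: String,
    server: String,
    device: u16,
    agent: u8,
}

impl Jid {
    /// Bare form: same user and server, device and agent reset to 0.
    fn to_non_ad(&self) -> Jid {
        Jid { device: 0, agent: 0, ..self.clone() }
    }
}

/// Build the sender key lookup key from the group and the *normalized* sender.
fn sender_key_cache_key(group: &str, sender: &Jid) -> String {
    let bare = sender.to_non_ad();
    format!("{}:{}@{}.{}", group, bare.user, bare.server, bare.device)
}
```

A device-qualified sender (as seen on the pkmsg) and a bare sender (as seen on the skmsg) then map to the same key, so the record stored during SKDM processing is found again at decryption time.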

Location: src/message.rs, wacore/libsignal/src/store/sender_key_name.rs, wacore/binary/src/jid.rs (to_non_ad())

Sender key distribution

Each participant generates and distributes a sender key:

// From wacore/libsignal/src/protocol/group_cipher.rs:283-336
pub async fn create_sender_key_distribution_message<R: Rng + CryptoRng>(
    sender_key_name: &SenderKeyName,
    sender_key_store: &mut dyn SenderKeyStore,
    csprng: &mut R,
) -> Result<SenderKeyDistributionMessage>

Structure:

Chain ID: Random 31-bit identifier for this sender key session
Iteration: Message counter (starts at 0)
Chain Key: 32-byte seed for deriving message keys
Signing Key: Ed25519 public key for message authentication

Group Encryption

Messages are encrypted with the sender’s current chain key:

// From wacore/libsignal/src/protocol/group_cipher.rs:53-116
pub async fn group_encrypt<R: Rng + CryptoRng>(
    sender_key_store: &mut dyn SenderKeyStore,
    sender_key_name: &SenderKeyName,
    plaintext: &[u8],
    csprng: &mut R,
) -> Result<SenderKeyMessage>

Process:

Load sender key state for the group
Derive message keys from current chain key
Encrypt with AES-256-CBC
Sign message with Ed25519 private key
Advance chain key

Group Decryption

Recipients decrypt using the sender’s distributed key:

// From wacore/libsignal/src/protocol/group_cipher.rs:162-250
pub async fn group_decrypt(
    skm_bytes: &[u8],
    sender_key_store: &dyn SenderKeyStore,
    sender_key_name: &SenderKeyName,
) -> Result<Vec<u8>>

Process:

Parse SenderKeyMessage
Look up sender key state by chain ID
Verify Ed25519 signature
Derive message keys for iteration (handling out-of-order)
Decrypt with AES-256-CBC

Group decryption maintains up to MAX_FORWARD_JUMPS (25000) cached message keys per sender. This prevents resource exhaustion attacks but limits tolerance for extreme out-of-order delivery.

Unknown device detection

During group message decryption, the client checks whether the sender’s device is present in the local device registry via is_from_known_device(). This detection triggers in two places within the group message processing path:

After successful skmsg decrypt: if the sender device is not in the registry, the decrypted message is still processed and delivered normally. Signal decryption success already proves the sender holds a valid session key, so discarding the message would only add latency via an unnecessary retry round-trip. A background device sync is triggered to update the local device registry.
After a NoSenderKeyState error: if the sender device is unknown, the retry reason is upgraded from NoSession to UnknownCompanionNoPrekey.

In both cases, the client queues a device list synchronization for the sender’s user JID. The behavior depends on the connection state:

Online: the client immediately invalidates the cached device registry for the user and fires a background usync request to refresh the device list
Offline (during offline sync): the unknown device’s user JID is batched into a PendingDeviceSync set, which is flushed after offline sync completes (see Deferred device sync)

Primary devices (device ID 0) are always treated as known — the check only applies to companion devices. This mechanism ensures that group messages from newly-paired companion devices are delivered immediately without waiting for a retry round-trip. The background device sync updates the local registry so future messages from the same device are recognized directly.

// src/message.rs — simplified flow
async fn handle_unknown_device_sync(&self, info: &Arc<MessageInfo>) {
    let user_jid = info.source.sender.to_non_ad();
    if !self.pending_device_sync.add(user_jid.clone()).await {
        return; // already queued, dedup
    }
    if info.is_offline {
        return; // batched for deferred flush
    }
    // Online: immediate sync
    self.invalidate_device_cache(&user_jid.user).await;
    self.get_user_devices(&[user_jid]).await.ok();
}

Location: src/message.rs, src/client/device_registry.rs, src/pending_device_sync.rs
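The deduplicating pending set used by this flow can be approximated with a mutex-guarded HashSet. This is a synchronous sketch, not the library's type: the real `add` is async and takes a Jid, and `take_all` is a hypothetical name for whatever drains the set after offline sync completes.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

#[derive(Default)]
struct PendingDeviceSync {
    users: Mutex<HashSet<String>>,
}

impl PendingDeviceSync {
    /// Returns true if the user was newly queued, false if already pending.
    fn add(&self, user_jid: &str) -> bool {
        self.users.lock().unwrap().insert(user_jid.to_string())
    }

    /// Drain every pending user (sorted for deterministic output).
    fn take_all(&self) -> Vec<String> {
        let mut drained: Vec<String> = self.users.lock().unwrap().drain().collect();
        drained.sort();
        drained
    }
}
```

Because `add` reports whether the entry was new, repeated messages from the same unknown device trigger at most one sync until the set is drained.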

Retry receipt from unknown group device

When the client receives a retry receipt, handle_retry_receipt checks whether the requesting device is present in the local device registry. Previously the handler dropped all retries from unregistered devices. That was safe for WA Web because WA Web keeps participant device lists fresh via a pre-send sync, so any legitimate requester is already known before the send.

For a library client, a participant device can legitimately be absent from the local registry: if the device joined between the last device-list sync and the group send, it will have received the skmsg from the server but never obtained a sender key, causing it to retry indefinitely. The retry receipt may carry a <keys> bundle (the ADV-signed device-identity, the identity key, a one-time prekey, and the signed prekey), which is everything needed to establish a Signal session and resend. But a newly-linked device that has no bundle still retries forever if the client only drops it: the reconciliation that fires when a prekey fetch returns 406 never triggers for that device because it was never in the send set.

Whenever a retry arrives from an unknown device, handle_retry_receipt now calls schedule_unknown_device_sync before consulting should_drop_unknown_device_retry. This treats the retry as a staleness signal: the requester’s user JID is enqueued for a device-list resync (deduplicated via PendingDeviceSync, so a retry storm from a single device cannot fan out into a usync storm). Once the resync completes, the device appears in the registry and future sends include it in the sender-key distribution, so the retries stop. This mirrors WA Web’s syncDeviceListJob trigger on the retry path.

The drop predicate still controls whether the current retry is recovered or dropped:

// wacore/src/protocol/retry.rs
pub fn should_drop_unknown_device_retry(keys_present: bool, device_known: bool) -> bool {
    !keys_present && !device_known
}

`keys_present`	`device_known`	Result
`true`	`false`	Recover — build a session from the embedded bundle and resend; resync also triggered
`false`	`false`	Drop — no bundle to recover this message; device-list resync triggered so device is learned for the next send
any	`true`	Resend — device is in registry, proceed normally
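The table can be read as a small decision function layered on top of the drop predicate. The `RetryOutcome` enum, the `RetryDecision` struct, and `classify_retry` below are hypothetical names used only to make the three rows explicit; only `should_drop_unknown_device_retry` comes from the source.

```rust
#[derive(Debug, PartialEq, Eq)]
enum RetryOutcome {
    /// Build a session from the embedded key bundle and resend.
    RecoverFromBundle,
    /// Nothing to recover this message with; drop the retry.
    Drop,
    /// Device is in the registry; resend normally.
    Resend,
}

#[derive(Debug, PartialEq, Eq)]
struct RetryDecision {
    outcome: RetryOutcome,
    /// Whether a device-list resync is scheduled for the requester's user.
    schedule_resync: bool,
}

fn should_drop_unknown_device_retry(keys_present: bool, device_known: bool) -> bool {
    !keys_present && !device_known
}

fn classify_retry(keys_present: bool, device_known: bool) -> RetryDecision {
    // Any retry from an unknown device is treated as a staleness signal.
    let schedule_resync = !device_known;
    let outcome = if device_known {
        RetryOutcome::Resend
    } else if should_drop_unknown_device_retry(keys_present, device_known) {
        RetryOutcome::Drop
    } else {
        RetryOutcome::RecoverFromBundle
    };
    RetryDecision { outcome, schedule_resync }
}
```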

When the bundle includes a <device-identity>, process_retry_key_bundle validates the ADV chain against the requester’s account key (using the stored primary identity as a fallback when the server omits account_signature_key). A present-but-invalid ADV result is a hard error; the session is not built. If <device-identity> is absent from the bundle, or if no account key can be found, the check is skipped with a warning and the session is built anyway, matching the behaviour of the regular prekey-fetch path. The drop predicate only gates on syntactic <keys> presence, so the ADV guarantee is conditional on the bundle including a well-formed <device-identity>. This mirrors whatsmeow’s approach of building the prekey session directly from the receipt bundle without a device-registry gate.

Location: src/retry.rs, wacore/src/protocol/retry.rs, src/pending_device_sync.rs

Immutable sender key loading

The SenderKeyStore trait’s load_sender_key method takes &self (not &mut self), allowing sender key lookups to proceed under a read lock. This is safe because loading a sender key is a pure read operation — no state is mutated. The store_sender_key method still requires &mut self since it modifies state. This means concurrent group decryptions for different senders can load sender keys in parallel without contention, while writes (SKDM processing) still serialize correctly.

If you implement SenderKeyStore for a custom backend, load_sender_key must use &self (immutable reference). Implementations that previously required &mut self for internal caching should use interior mutability (e.g., Mutex or RwLock) instead.
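A custom backend that wants an internal read-through cache can keep `load_sender_key` on `&self` by placing the cache behind a lock. The sketch below uses a simplified, synchronous stand-in for the trait (the real trait is async and uses SenderKeyName and record types); `CachingStore` and its fields are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Simplified stand-in for the real trait: string names, byte records, no async.
trait SenderKeyStore {
    fn load_sender_key(&self, name: &str) -> Option<Vec<u8>>;
    fn store_sender_key(&mut self, name: &str, record: &[u8]);
}

#[derive(Default)]
struct CachingStore {
    backend: HashMap<String, Vec<u8>>,        // stands in for the database
    cache: RwLock<HashMap<String, Vec<u8>>>,  // interior mutability for &self loads
}

impl SenderKeyStore for CachingStore {
    fn load_sender_key(&self, name: &str) -> Option<Vec<u8>> {
        // Fast path: shared read lock, no mutation.
        if let Some(hit) = self.cache.read().unwrap().get(name) {
            return Some(hit.clone());
        }
        // Miss: read the backend, then populate the cache under a write lock.
        let record = self.backend.get(name)?.clone();
        self.cache.write().unwrap().insert(name.to_string(), record.clone());
        Some(record)
    }

    fn store_sender_key(&mut self, name: &str, record: &[u8]) {
        self.backend.insert(name.to_string(), record.to_vec());
        // Exclusive access via &mut self: invalidate without locking.
        self.cache.get_mut().unwrap().remove(name);
    }
}
```

The read guard taken in the fast path is released before the write lock is requested, so a miss cannot deadlock against itself.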

Sender key existence check

Before distributing sender keys, the group message path checks whether the local sender key already exists. This check uses the SignalStoreCache with a read lock (get_sender_key()), matching the status broadcast path. This avoids acquiring a write lock and prevents unnecessary SKDM re-distribution on every group send.

Per-device sender key tracking

To avoid resending Sender Key Distribution Messages on every group message, the client tracks sender key distribution status per device for each group. This uses a unified sender_key_devices table (see Storage - ProtocolStore) that matches WhatsApp Web’s participant.senderKey Map<deviceJid, boolean> model: a single boolean per device per group indicating whether that device has a valid sender key (true) or needs fresh SKDM distribution (false).

The tracking update is deferred until after the server acknowledges the message stanza. This matches WhatsApp Web’s behavior where markHasSenderKey() is only called after the server confirms receipt.

Why deferred? If the tracking were updated immediately after building the stanza (but before sending), a network failure between stanza build and send would leave stale entries: devices would be marked as having the sender key when they never actually received it. Subsequent messages would skip SKDM for those devices, causing decryption failures.

PreparedGroupStanza return value: prepare_group_stanza returns a PreparedGroupStanza struct containing the stanza node and a skdm_devices: Vec<Jid> field listing exactly which devices received SKDM in this stanza. This eliminates the need for callers to re-resolve devices after sending, closing a race window where the device list could change between stanza preparation and post-ACK tracking update.

pub struct PreparedGroupStanza {
    pub node: Node,
    /// Devices that received SKDM in this stanza. Empty when no SKDM was distributed.
    pub skdm_devices: Vec<Jid>,
}

Implementation:

Group path: After send_node() succeeds, the caller uses the skdm_devices list from PreparedGroupStanza to call set_sender_key_status(group, devices, true). No re-resolution needed.
Status path: A late-init boolean tracks whether full distribution occurred. The sender key tracking is only updated after the status stanza is successfully sent.
Error recovery: If prepare_group_stanza fails with NoSenderKeyState, all sender key device tracking for that group is cleared and the send is retried with full distribution.
Sender key rotation: On rotateKey, the Signal sender key is also deleted for forward secrecy (matching WhatsApp Web’s deleteGroupSenderKeyInfo), and all device tracking is cleared via clear_sender_key_devices.

Incremental targeting: Rather than distributing the sender key to all group devices on every message, the client:

Loads the per-device sender key map — first checking the in-memory cache, falling back to the database via get_sender_key_devices
Resolves all current group participant devices
Computes the diff — only devices with has_key=false or not yet tracked receive the SKDM
Passes the targeted device list to prepare_group_stanza via the skdm_target_devices parameter

On the first group send (or any send where the cached map is empty), the filter still runs unconditionally — every resolved participant device is treated as has_key=false and receives the SKDM. This matches WhatsApp Web, which iterates an empty senderKey Map as false per participant. There is no early-exit for an empty cache; otherwise the very first message after a fresh start would skip distribution entirely.
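The diff step can be sketched as a filter over the resolved device list. This is an illustration with assumed types: device JIDs are plain strings and the sender key map is a `HashMap<String, bool>`, whereas the real code uses Jid values and a pre-indexed SenderKeyDeviceMap; `devices_needing_skdm` is a hypothetical helper name.

```rust
use std::collections::HashMap;

/// Return the devices that still need an SKDM: those tracked with
/// `has_key = false` and those not tracked at all.
fn devices_needing_skdm(
    resolved_devices: &[String],
    sender_key_map: &HashMap<String, bool>,
) -> Vec<String> {
    resolved_devices
        .iter()
        // A missing entry is treated exactly like `false`.
        .filter(|device| !sender_key_map.get(*device).copied().unwrap_or(false))
        .cloned()
        .collect()
}
```

With an empty map the filter keeps every resolved device, which is the first-send behavior described above: there is no special case for an empty cache.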

Location: src/send.rs, src/client/sender_keys.rs, wacore/src/send.rs

Parallelized group encrypt fan-out

The group send path no longer serializes encryption behind a client-level lock. prepare_group_stanza and encrypt_for_devices now take an explicit &runtime handle (&*self.runtime) so per-device encryption can run on runtime::blocking() tasks concurrently. Combined with the move to update_device_lists (batched device-registry writes) and a no-lock IdentityAdapter::is_trusted_identity stub, group fan-out scales with the runtime’s worker count rather than with a single critical section.

This is an internal performance change: no public method on Client::send_message was renamed, and the order of <to> children in the resulting stanza is unchanged. If you implemented a custom SignalStore, note that update_device_lists(records: Vec<DeviceListRecord>) is now part of the trait so the fan-out can batch its writes.

While per-device encryption runs concurrently, the sender-key chain is protected by two separate locks per (group, sender) pair:

Session-setup lock (SenderKeyStore::session_setup_lock) — held only across ensure_sessions_for_devices (prekey fetch + X3DH). May span network I/O. Warm sends (no SKDM needed) never take it, so they are never blocked by a cold send’s network round-trip.
Chain lock (SenderKeyStore::sender_key_lock) — held across SKDM creation + pairwise encrypt fan-out + skmsg encrypt. Pure CPU; never spans network I/O. This is the invariant that prevents two concurrent sends from splitting the key between the SKDM and the skmsg.

Prior to #807, a single chain lock covered both phases, causing concurrent group sends to serialize behind a server round-trip whenever a new session needed to be established. Now only the CPU phase is in the critical section. Different groups (or different senders) still encrypt fully in parallel, as before.

encrypt_for_devices is composed of two public halves: ensure_sessions_for_devices (network, returns SessionPlan) and encrypt_for_devices_with_sessions (CPU, consumes SessionPlan). The DM path calls encrypt_for_devices unchanged; the group path calls them separately with the chain lock taken only around the second.

In-memory sender key device cache

The SenderKeyDeviceCache provides an in-memory caching layer over the per-device sender key tracking data stored in the database. Without this cache, every group send would require a database round-trip to load the sender key device map — the cache eliminates that overhead after the first load for each group.

pub(crate) struct SenderKeyDeviceCache {
    inner: Cache<String, Arc<SenderKeyDeviceMap>>,
}

Key design decisions:

Time-to-idle eviction: The cache uses TTI semantics (default: 1 hour, 500 entries), so entries for inactive groups are automatically evicted while frequently-used groups stay cached
Pre-parsed, pre-indexed maps: Database rows are parsed into a SenderKeyDeviceMap struct that provides O(1) lookups by user and device ID, avoiding per-query string parsing
Single-flight initialization: The get_or_init method uses moka’s built-in coalescing — if multiple concurrent group sends for the same group trigger a cache miss simultaneously, only one database read executes and all callers share the result
Explicit invalidation: The cache is invalidated when sender key state changes (rotation, error recovery, retry failures) so stale data is never served

// Atomic get-or-init: concurrent callers for the same group
// share the single database read result
let cached_map = self
    .sender_key_device_cache
    .get_or_init(group_jid, async {
        let db_rows = pm.get_sender_key_devices(group_jid).await.unwrap_or_default();
        Arc::new(SenderKeyDeviceMap::from_db_rows(&db_rows))
    })
    .await;

SenderKeyDeviceMap structure: The SenderKeyDeviceMap pre-parses JID strings from the database into a user-to-devices HashMap for efficient lookup:

pub(crate) struct SenderKeyDeviceMap {
    /// user → (device_id → has_key)
    devices: HashMap<Arc<str>, HashMap<u16, bool>>,
    /// Users with at least one has_key=false device
    forgotten_users: HashSet<Arc<str>>,
}
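As a rough sketch of the pre-parsing step, the code below builds the same two-level layout from database rows. The row shape `(JID string, has_key)`, the `user:device@server` string format, and the names `DeviceMapSketch`, `from_rows`, and `has_key` are assumptions made for illustration; the real from_db_rows may differ.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

/// Illustrative stand-in; the field layout follows the struct above.
struct DeviceMapSketch {
    devices: HashMap<Arc<str>, HashMap<u16, bool>>,
    forgotten_users: HashSet<Arc<str>>,
}

impl DeviceMapSketch {
    /// Assumed row shape: (JID string such as "user:device@server", has_key).
    fn from_rows(rows: &[(&str, bool)]) -> Self {
        let mut devices: HashMap<Arc<str>, HashMap<u16, bool>> = HashMap::new();
        let mut forgotten_users: HashSet<Arc<str>> = HashSet::new();
        for (jid, has_key) in rows {
            // Drop the "@server" suffix, then split an optional ":device" part.
            let local = jid.split('@').next().unwrap_or("");
            let (user, device) = match local.split_once(':') {
                // Malformed device parts fall back to 0 in this sketch.
                Some((u, d)) => (u, d.parse::<u16>().unwrap_or(0)),
                None => (local, 0),
            };
            let user: Arc<str> = Arc::from(user);
            devices.entry(user.clone()).or_default().insert(device, *has_key);
            if !*has_key {
                forgotten_users.insert(user);
            }
        }
        Self { devices, forgotten_users }
    }

    /// Average O(1) lookup; unknown users or devices report false.
    fn has_key(&self, user: &str, device: u16) -> bool {
        self.devices
            .get(user)
            .and_then(|per_device| per_device.get(&device))
            .copied()
            .unwrap_or(false)
    }
}
```

Parsing once at load time means each per-send lookup is two hash probes rather than a string split per device.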

Cache invalidation points:

Event	Action
Sender key rotation (`rotateKey`)	Invalidate group entry
`NoSenderKeyState` error during send	Invalidate group entry
Retry failure for a group message	Invalidate group entry
Server rejects group stanza	Invalidate group entry
New device added (`patch_device_add`)	Invalidate all entries
Device removed (`patch_device_remove`)	Invalidate all entries
Identity change (`clear_device_record`)	No global tracker wipe — per-device SKDM redistribution is driven by retry receipts (`markForgetSenderKey`), matching WhatsApp Web’s `WAWebUpdateLocalSignalSession`. The `status@broadcast` sender key is still deleted for forward secrecy.

You can tune the cache capacity and TTI via the sender_key_devices_cache field in CacheConfig.

Location: src/sender_key_device_cache.rs, src/send.rs

Phash validation for stale device list detection

When sending group, status, or DM messages, the library validates the participant hash (phash) returned in the server’s acknowledgment against the locally computed phash. A mismatch indicates that the server’s view of participant devices differs from the client’s, meaning the local device list is stale.

How it works:

Before sending, the client obtains the locally computed phash — from the stanza phash attribute for group/status messages, or from PreparedDmStanza.phash for DMs
A oneshot ack waiter is registered for the message ID via register_ack_waiter
The message stanza is sent to the server
A background task (spawn_phash_validation) awaits the server’s ack (with a 10-second timeout)
The server’s ack includes its own phash — if it differs from the local value, the client invalidates caches

On mismatch, the following caches are invalidated:

Send path	Sender key device cache	Group info cache	Device registry
Group messages	Invalidated	Invalidated	—
Status messages	Invalidated	Not invalidated	—
DM messages	—	—	Recipient + own PN devices invalidated

For DM messages, the phash covers both recipient and own devices (matching WA Web’s syncDeviceListJob([recipient, me])). On mismatch, the client invalidates the device registry cache for both the recipient’s user JID and your own phone number (PN) JID, ensuring the next send re-fetches the current device list for both parties.
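The invalidation table can be read as a simple mapping from send path to affected caches. The sketch below encodes it with hypothetical types (`SendPath`, `Invalidation`, `on_phash_mismatch`); none of these names come from the crate, and cells marked as not applicable in the table are modeled as `false`.

```rust
/// Send paths that participate in phash validation.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SendPath {
    Group,
    Status,
    Dm,
}

/// Which caches to drop after a mismatch (hypothetical summary type).
#[derive(Debug, PartialEq)]
struct Invalidation {
    sender_key_device_cache: bool,
    group_info_cache: bool,
    /// For DMs this covers the recipient and our own PN devices.
    device_registry: bool,
}

/// Mirrors the table: one row per send path.
fn on_phash_mismatch(path: SendPath) -> Invalidation {
    match path {
        SendPath::Group => Invalidation {
            sender_key_device_cache: true,
            group_info_cache: true,
            device_registry: false,
        },
        SendPath::Status => Invalidation {
            sender_key_device_cache: true,
            group_info_cache: false,
            device_registry: false,
        },
        SendPath::Dm => Invalidation {
            sender_key_device_cache: false,
            group_info_cache: false,
            device_registry: true,
        },
    }
}
```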

// src/send.rs — simplified phash validation flow

// Group/status path: phash from stanza attribute
let our_phash = stanza.attrs().optional_string("phash").map(|s| s.into_owned());

// DM path: phash from PreparedDmStanza (not on the wire)
let dm_phash = prepared.phash;

// On DM phash mismatch:
if !jid.is_group() && !jid.is_status_broadcast() {
    client.invalidate_device_cache(&jid.user).await;
    if let Some(own_pn) = &client.persistence_manager.get_device_snapshot().pn {
        client.invalidate_device_cache(&own_pn.user).await;
    }
}

The phash validation runs asynchronously in the background and does not block the send path. If the server ack times out (after 10 seconds) or the oneshot channel is dropped, the validation is silently skipped. This matches WhatsApp Web’s approach of using phash as a best-effort staleness detector rather than a hard requirement.

WA Web phash parity (v0.6)

Two corrections aligned the group phash with WA Web’s phashV2:

Full device set, every send. The group phash is now computed over the complete resolved participant device set plus the sending device on every send — not just the devices that received an SKDM in that stanza. Warm sends (which distribute no new SKDM) now pass the full resolved set via the all_devices_for_phash parameter to prepare_group_stanza, so the phash matches the server’s view even when the SKDM target set is empty. Status broadcasts keep their prior phash behavior.
Standard base64 alphabet. The phash now encodes with the standard base64 alphabet (+ and /) instead of the URL-safe one (- and _), matching WA Web and whatsmeow.
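To make the alphabet difference concrete, here is a small hand-rolled encoder using only the standard library. It is an illustration of the two alphabets, not the encoder the crate uses, and `encode_b64` is a hypothetical name. For brevity it only accepts input whose length is a multiple of three, so padding never arises.

```rust
const STANDARD: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
const URL_SAFE: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encodes `data` with the given alphabet. This sketch only handles
/// inputs whose length is a multiple of 3 (no padding needed).
fn encode_b64(data: &[u8], alphabet: &[u8; 64]) -> String {
    assert!(data.len() % 3 == 0, "sketch handles 3-byte groups only");
    let mut out = String::with_capacity(data.len() / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack three bytes into 24 bits, then emit four 6-bit symbols.
        let n = (chunk[0] as u32) << 16 | (chunk[1] as u32) << 8 | chunk[2] as u32;
        for shift in [18u32, 12, 6, 0] {
            out.push(alphabet[((n >> shift) & 0x3f) as usize] as char);
        }
    }
    out
}
```

The same bytes produce different strings only when a 6-bit symbol has value 62 or 63, which is why a mismatch of this kind shows up intermittently rather than on every hash.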

The client also now persists group metadata locally after a query and sends the stored participant phash on the next group query, letting the server answer not-modified (304) when membership is unchanged, which saves a full metadata round-trip.

Location: src/send.rs, src/client.rs

Cryptographic Primitives

AES-256-CBC (message content)

Used for encrypting message bodies in both 1:1 and group messages:

pub fn aes_256_cbc_encrypt_into(
    plaintext: &[u8],
    key: &[u8],      // 32 bytes
    iv: &[u8],       // 16 bytes
    output: &mut Vec<u8>,
) -> Result<()>

Location: wacore/libsignal/src/crypto/aes_cbc.rs

Thread-Local Buffers

The implementation uses thread-local buffers to reduce allocations:

thread_local! {
    static ENCRYPTION_BUFFER: RefCell<EncryptionBuffer> = ...;
    static DECRYPTION_BUFFER: RefCell<EncryptionBuffer> = ...;
}

// Usage in session_cipher.rs:99-111
let ctext = ENCRYPTION_BUFFER.with(|buffer| {
    let mut buf_wrapper = buffer.borrow_mut();
    let buf = buf_wrapper.get_buffer();
    aes_256_cbc_encrypt_into(ptext, message_keys.cipher_key(), message_keys.iv(), buf)?;
    let result = std::mem::take(buf);
    buf.reserve(EncryptionBuffer::INITIAL_CAPACITY);
    Ok::<Vec<u8>, SignalProtocolError>(result)
})?;

Location: wacore/libsignal/src/protocol/session_cipher.rs:14-54

HKDF-SHA256

Used for key derivation in session initialization:

pub fn derive_keys(secret_input: &[u8]) -> (RootKey, ChainKey, InitialPQRKey) {
    let mut secrets = [0; 96];
    hkdf::Hkdf::<sha2::Sha256>::new(None, secret_input)
        .expand(b"WhisperText", &mut secrets)
        .expect("valid length");
    // Split into RootKey[32], ChainKey[32], PQRKey[32]
}

Location: wacore/libsignal/src/protocol/ratchet.rs:18-39

PreKey Management

Pre-keys enable asynchronous session establishment in the Signal Protocol. whatsapp-rust manages pre-key generation and upload to match WhatsApp Web’s behavior.

Configuration

The per-batch upload count is configurable through the builder/factory API (default 812, matching WhatsApp Web’s UPLOAD_KEYS_COUNT). The upload-trigger threshold is a private constant.

Setting	Default	Description
`BotBuilder::with_wanted_pre_key_count` / `Client::set_wanted_pre_key_count`	812	Number of pre-keys generated and uploaded per batch. Clamped to `5..=65_535` at upload time.
`MIN_PRE_KEY_COUNT` (private const)	5	Minimum server-side pre-key count before triggering an upload.

// Via the Bot builder
Bot::builder()
    .with_wanted_pre_key_count(256) // smaller batches for embedded consumers
    // ...

// Or directly on a Client constructed by hand (before connect)
client.set_wanted_pre_key_count(256);

Values outside [5, 65_535] are clamped at upload time (an out-of-range value logs a warn!). The floor avoids an empty-but-flagged pool or a re-upload loop. The ceiling is the wire-format limit: the upload IQ encodes the pre-key list length as a u16, so a larger batch would generate keys locally and then fail to encode. Per-key X25519 generation and prost encoding for the batch are offloaded via wacore::runtime::blocking (runtime-agnostic; runs inline on wasm) since the caller-controlled batch size can be large.

Pre-key ID counter and wrap-around

Pre-key IDs use a persistent monotonic counter (Device::next_pre_key_id) that only increases, matching WhatsApp Web’s NEXT_PK_ID pattern:

// Determine starting ID using both the persistent counter AND the store max
let max_id = backend.get_max_prekey_id().await?;
let start_id = if device_snapshot.next_pre_key_id > 0 {
    std::cmp::max(device_snapshot.next_pre_key_id, max_id + 1)
} else {
    // Migration: start from MAX(key_id) + 1
    max_id + 1
};

This approach prevents ID collisions when pre-keys are consumed non-sequentially from the store.

24-bit wrap-around: WhatsApp Web uses 24-bit pre-key IDs on the wire (3-byte big-endian), so valid IDs range from 1 to 16,777,215 (2^24 − 1). When the persistent counter grows past this boundary, modular arithmetic wraps IDs back into the valid range:

const MAX_PREKEY_ID: u32 = 16777215; // 2^24 - 1

// Wrap start ID into valid [1, MAX_PREKEY_ID] range
let start_id = ((raw_start as u64 - 1) % MAX_PREKEY_ID as u64) as u32 + 1;

// Each key ID in the batch is also wrapped
let pre_key_id = (((start_id as u64 - 1) + i as u64) % (MAX_PREKEY_ID as u64)) as u32 + 1;

// After upload, the persisted next_pre_key_id wraps too
let next_id = (((start_id as u64 - 1) + key_pairs_to_upload.len() as u64)
    % (MAX_PREKEY_ID as u64)) as u32 + 1;

If the counter wraps while unconsumed high-ID pre-keys still exist in the store, the database upsert (ON CONFLICT DO UPDATE) silently overwrites them. This is an accepted trade-off because the server consumes keys well before a full 16M cycle completes.
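A worked example of the wrap-around arithmetic, using the same formula as the snippets above. The helper names `wrap_pre_key_id` and `batch_ids` are hypothetical and exist only for this illustration.

```rust
const MAX_PREKEY_ID: u32 = 16_777_215; // 2^24 - 1

/// Maps any counter value >= 1 into [1, MAX_PREKEY_ID].
fn wrap_pre_key_id(raw: u64) -> u32 {
    debug_assert!(raw >= 1);
    ((raw - 1) % MAX_PREKEY_ID as u64) as u32 + 1
}

/// IDs for a batch of `len` keys starting at `start_id`,
/// plus the counter value to persist afterwards.
fn batch_ids(start_id: u32, len: u32) -> (Vec<u32>, u32) {
    let ids: Vec<u32> = (0..len as u64)
        .map(|i| wrap_pre_key_id(start_id as u64 + i))
        .collect();
    let next = wrap_pre_key_id(start_id as u64 + len as u64);
    (ids, next)
}
```

Starting at 16,777,214 with a batch of four yields the IDs 16,777,214, 16,777,215, 1, 2 and a persisted next ID of 3. ID 0 is never produced.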

Location: src/prekeys.rs

Force-refreshing pre-keys for device migration

When migrating a device from an external source (e.g., a Baileys session into an InMemoryBackend), the server may still hold pre-key IDs whose private key material you cannot reconstruct. Any pkmsg referencing those IDs will fail permanently with InvalidPreKeyId. The public refresh_pre_keys() method force-uploads a fresh batch of Client::wanted_pre_key_count() pre-keys (default 812; tunable via with_wanted_pre_key_count / set_wanted_pre_key_count), giving the server new IDs the caller has locally. Old unmatched IDs drain naturally as peers consume them.

// After restoring a session from another library
client.refresh_pre_keys().await?;

Internally, this acquires prekey_upload_lock to prevent races with the count-based and digest-repair upload paths, then calls upload_pre_keys_with_retry(force: true), which uses Fibonacci backoff (1s, 2s, 3s, 5s, 8s, … capped at 610s).

Location: src/prekeys.rs:263-266
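The backoff schedule can be reproduced with a few lines of standard-library Rust. This is a sketch of the sequence only: `fibonacci_backoff` and the 0-based `attempt` parameter are assumptions, and the real retry loop may count attempts differently.

```rust
use std::time::Duration;

const MAX_BACKOFF_SECS: u64 = 610;

/// Delay before retry number `attempt` (0-based): 1s, 2s, 3s, 5s, 8s, ...
/// The delay never exceeds MAX_BACKOFF_SECS.
fn fibonacci_backoff(attempt: u32) -> Duration {
    let (mut current, mut next) = (1u64, 2u64);
    for _ in 0..attempt {
        if current >= MAX_BACKOFF_SECS {
            break; // Already at the cap; stop growing.
        }
        let sum = current + next;
        current = next;
        next = sum;
    }
    Duration::from_secs(current.min(MAX_BACKOFF_SECS))
}
```

Because 610 is itself a term of the sequence, the cap is reached exactly at attempt 13 and every later attempt waits the same 610 seconds.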

Digest key validation

After connection, the client validates that the server’s copy of the key bundle matches local keys. This matches WhatsApp Web’s WAWebDigestKeyJob.digestKey() flow.

Wire format:

<!-- Request -->
<iq xmlns="encrypt" type="get" to="s.whatsapp.net" id="...">
  <digest/>
</iq>

<!-- Response -->
<iq from="s.whatsapp.net" id="..." type="result">
  <digest>
    <registration>[4-byte BE registration ID]</registration>
    <type>[1-byte: 5]</type>
    <identity>[32-byte identity public key]</identity>
    <skey>
      <id>[3-byte BE signed pre-key ID]</id>
      <value>[32-byte signed pre-key public]</value>
      <signature>[64-byte signature]</signature>
    </skey>
    <list>
      <id>[3-byte BE prekey ID]</id>
      ...
    </list>
    <hash>[20-byte SHA-1 hash]</hash>
  </digest>
</iq>

Validation process:

Query the server for the key bundle digest via DigestKeyBundleSpec
If the server returns 404 (no record), trigger a full pre-key re-upload
If the server returns 406/503 or other errors, log and skip
On success, compare registration IDs
Load each pre-key referenced by the server and extract its public key
Compute a local SHA-1 digest over: identity public key + signed pre-key public + signed pre-key signature + all pre-key public keys
Compare the local hash against the server-provided hash

The <list> node contains <id> children (not <key> children). The parser iterates all children of <list> without tag filtering, matching WhatsApp Web’s mapChildren behavior which does not filter by tag name.

Hash mismatches or missing local pre-keys are logged but do not trigger a re-upload. Only a 404 response (server has no record) triggers re-upload. This matches WhatsApp Web’s behavior where validateLocalKeyBundle exceptions are caught without re-uploading — the normal RotateKeyJob eventually refreshes keys.
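The decision logic above reduces to a small match. The types below (`DigestOutcome`, `DigestAction`) are hypothetical and only summarize the rules stated in this section; they are not the crate's types.

```rust
/// Hypothetical summary of a digest query outcome.
enum DigestOutcome {
    /// Server answered with an error status code.
    ServerError(u16),
    /// Server returned a digest; `hash_matches` is the local comparison result.
    Digest { hash_matches: bool },
}

#[derive(Debug, PartialEq)]
enum DigestAction {
    ReuploadPreKeys,
    LogAndSkip,
    NothingToDo,
}

fn digest_action(outcome: &DigestOutcome) -> DigestAction {
    match outcome {
        // Only "no record on the server" triggers a re-upload.
        DigestOutcome::ServerError(404) => DigestAction::ReuploadPreKeys,
        // 406, 503, and any other error: log and move on.
        DigestOutcome::ServerError(_) => DigestAction::LogAndSkip,
        DigestOutcome::Digest { hash_matches: true } => DigestAction::NothingToDo,
        // A mismatch is logged but does not force a re-upload.
        DigestOutcome::Digest { hash_matches: false } => DigestAction::LogAndSkip,
    }
}
```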

Location: src/prekeys.rs:218-344, wacore/src/iq/prekeys.rs:170-302

Re-pair pre-key healing (v0.6)

If the user re-pairs the device (for example by re-scanning the QR code), the server discards its copy of our pre-key bundle even though Device::server_has_prekeys may still read true from the previous pairing. v0.6 resets server_has_prekeys = false immediately after a successful re-pair so the next connect uploads a fresh batch instead of trusting the stale flag.

The lock-acquisition for the digest-key validator also moved into validate_digest_key itself. Previously the caller held prekey_upload_lock before calling the validator, which would deadlock when validation hit a 404 and tried to acquire the same lock to perform the re-upload. The lock now wraps only the re-upload path, so the 404→re-upload transition completes without contention.

Location: src/handlers/notification.rs, src/pair.rs, src/prekeys.rs

ADV companion identity validation

When fetching a pre-key bundle for a contact’s companion device (WhatsApp Web / Desktop), the bundle’s <device-identity> element is validated to confirm that the fetched identity key is cryptographically bound to the account. This guards against a relay substituting a forged identity key, matching WA Web’s SessionApi.createSignalSession. Account key resolution mirrors WA Web’s validateADVwithIdentityKey (e.accountSignatureKey || t):

In-blob key: If ADVSignedDeviceIdentity.account_signature_key is present and non-empty, it is used directly.
Stored identity fallback: The server legitimately omits this field for a contact’s companion because the client already holds the contact’s primary (device 0) identity in the Signal identity store. When the field is absent, Client::load_account_identity loads it — reading through the SignalStoreCache so any unflushed mutations from the current session are visible. PreKeyFetchSpec::with_account_identities threads the pre-loaded map into wacore’s stateless prekey parser, keeping store access in the whatsapp-rust crate.
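The two-step key resolution amounts to "in-blob key if non-empty, else stored identity". A minimal sketch, with `resolve_account_key` as a hypothetical helper operating on raw byte slices rather than the crate's key types:

```rust
/// Picks the account signature key: the in-blob key when present and
/// non-empty, otherwise the stored primary identity (if any).
fn resolve_account_key<'a>(
    in_blob: Option<&'a [u8]>,
    stored_identity: Option<&'a [u8]>,
) -> Option<&'a [u8]> {
    in_blob.filter(|key| !key.is_empty()).or(stored_identity)
}
```

A `None` result corresponds to the case where neither source has a key, so no signature check can be attempted.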

Validation results (wacore::adv::AdvValidation):

Variant	Condition	Action
`Valid`	Both account and device signatures verified	Session is established normally
`Invalid`	Blob is malformed, or signatures fail against the available key	Bundle is rejected — a relay swapping in a forged identity lands here
`NoAccountKey`	Neither the blob nor the store has the key	Bundle is kept, ADV check skipped (logged as `warn!`)

NoAccountKey does not weaken security beyond the pre-existing “device-identity absent” path: a relay could already strip the entire <device-identity> element to bypass the check. It exists so brand-new contacts whose primary identity has never been seen are not silently dropped.

The same three-state validation applies in the retry-receipt handler (src/retry.rs) when a companion device requests a re-send.

Location: wacore/src/adv.rs, wacore/src/iq/prekeys.rs, src/prekeys.rs

Storage Integration

whatsapp-rust integrates Signal Protocol storage through a layered architecture:

src/store/
├── signal.rs               # SignalStore trait impl for Device (identity, session, prekey, sender key)
├── signal_adapter.rs       # SignalProtocolStoreAdapter — cache-backed adapter bridging wacore traits to libsignal traits
└── signal_cache.rs         # Re-export of wacore::store::signal_cache::SignalStoreCache

The Device struct implements the libsignal SessionStore, IdentityKeyStore, and other traits. These are wrapped by SignalProtocolStoreAdapter, which adds the SignalStoreCache layer — sessions are cached as SessionRecord objects (not bytes), with serialization deferred to flush(). Each store (sessions, identities, sender keys) is flushed independently under its own lock. Only one store is locked during its I/O — the other two remain free for concurrent encrypt/decrypt operations. The lock is held from snapshot through write through clear, so mutations to the same store are blocked until flush completes, preventing dirty-set races:

// SignalProtocolStoreAdapter reads/writes through the cache
#[async_trait]
impl SessionStore for SessionAdapter {
    async fn load_session(
        &self,
        address: &ProtocolAddress,
    ) -> Result<Option<SessionRecord>, SignalProtocolError> {
        // Returns cached SessionRecord object directly (no deserialization)
        // Cold load deserializes from backend bytes once and caches the object
        self.cache.get_session(&addr_str, &*device.backend).await
    }

    async fn store_session(
        &mut self,
        address: &ProtocolAddress,
        record: SessionRecord,  // Takes ownership — zero-cost move
    ) -> Result<(), SignalProtocolError> {
        // Stores the SessionRecord object in cache, marks dirty
        // Serialization happens only during flush()
        self.cache.put_session(&addr_str, record).await;
        Ok(())
    }
}

Security Considerations

Identity key trust

The implementation verifies identity keys before encryption/decryption:

if !identity_store
    .is_trusted_identity(remote_address, &their_identity_key, 
                         Direction::Sending)
    .await?
{
    return Err(SignalProtocolError::UntrustedIdentity(
        remote_address.clone(),
    ));
}

Location: wacore/libsignal/src/protocol/session_cipher.rs:160-172

Self-only protocol message gating

app_state_sync_key_share and history_sync_notification are protocol messages WhatsApp Web treats as “self-only”: they only carry meaning when delivered from your own account to another of your linked devices. v0.6 hardens handle_decrypted_plaintext so that incoming copies of these two messages are dropped unless MessageInfo.source.is_from_me is true, matching WA Web’s WAWebKeyManagementHandleKeyShareApi and whatsmeow’s gating.

The consequences if the gate is missing:

A spoofed app_state_sync_key_share from a peer would let an attacker inject an app-state encryption key, leading to attacker-controlled mutations of your contacts, blocklist, archive state, etc.
A spoofed history_sync_notification would point the client at attacker-supplied media for ingestion as your own history.

If you implement a custom message dispatcher, replicate this is_from_me check before honoring either protocol message. Other protocol-message types (REVOKE, EPHEMERAL_SETTING, MESSAGE_EDIT, …) keep their existing semantics.

Location: src/message.rs (handle_decrypted_plaintext)
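For a custom dispatcher, the gate could look roughly like this. `ProtocolKind` is a reduced, hypothetical enum and `passes_self_only_gate` is not a function from the crate; only the rule itself (two self-only kinds, gated on is_from_me) comes from the text above.

```rust
/// Hypothetical, reduced set of protocol message kinds for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProtocolKind {
    AppStateSyncKeyShare,
    HistorySyncNotification,
    Revoke,
    MessageEdit,
}

/// Returns true when the message may be handled.
fn passes_self_only_gate(kind: ProtocolKind, is_from_me: bool) -> bool {
    match kind {
        // Self-only messages are dropped unless they come from our own account.
        ProtocolKind::AppStateSyncKeyShare | ProtocolKind::HistorySyncNotification => is_from_me,
        // Other kinds keep their existing semantics and are not gated here.
        ProtocolKind::Revoke | ProtocolKind::MessageEdit => true,
    }
}
```

Listing every variant explicitly, rather than using a wildcard arm, makes the compiler flag any newly added kind so the gating decision is made deliberately.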

Duplicate message detection

The protocol detects and rejects duplicate messages:

if chain_index > counter {
    return match state.get_message_keys(their_ephemeral, counter)? {
        Some(keys) => Ok(keys),
        None => Err(SignalProtocolError::DuplicateMessage(chain_index, counter)),
    };
}

Location: wacore/libsignal/src/protocol/session_cipher.rs:822-827

Log level discipline

The protocol layer follows strict rules about what cryptographic material appears in logs and at which level:

No private keys or secrets are ever logged — ChainKey, MessageKeys, and RootKey types do not expose their key bytes through logging
Public keys appear only at warn/error levels — and only when something has gone wrong (untrusted identity, MAC failure)
MAC key fingerprints are truncated — only the first 4 bytes (8 hex chars) are logged during MAC verification failures, not the full key:
```
let mac_key_fingerprint: String = hex::encode(mac_key_bytes).chars().take(8).collect();
```
Ratchet keys in debug logs — successful decryptions log the sender ratchet public key (never private) at debug level for diagnostics
Pre-key operations use debug for routine operations and warn/info for exceptional conditions

The Signal protocol layer (wacore/libsignal/src/protocol/) uses no trace!-level logging. Sensitive operations stay at debug or above to avoid leaking material in verbose log configurations.

Session state corruption

Detailed logging helps diagnose crypto failures:

fn create_decryption_failure_log(
    remote_address: &ProtocolAddress,
    errs: &[SignalProtocolError],
    record: &SessionRecord,
    ciphertext: &SignalMessage,
) -> Result<String>

This generates comprehensive error logs showing:

All attempted session states
Receiver chain information
Message metadata (sender ratchet key, counter)

Location: wacore/libsignal/src/protocol/session_cipher.rs:365-454

Protocol safety limits

The implementation enforces several hard limits to prevent resource exhaustion and cryptographic failures:

Constant	Value	Purpose
`MAX_PREKEY_ID`	16,777,215 (2^24 − 1)	Maximum valid pre-key ID (24-bit wire format)
`MAX_FORWARD_JUMPS`	25,000	Maximum message skip in a ratchet chain
`MAX_MESSAGE_KEYS`	2,000	Maximum cached out-of-order message keys per chain
`MAX_RECEIVER_CHAINS`	5	Maximum receiver chains per session
`ARCHIVED_STATES_MAX_LENGTH`	40	Maximum archived session states
`MAX_SENDER_KEY_STATES`	5	Maximum sender key states per group
`MESSAGE_KEY_PRUNE_THRESHOLD`	50	Amortized eviction trigger for old message keys
Chain key index	u32::MAX	Overflow returns `InvalidState` error (not silent wrap)

Location: wacore/libsignal/src/protocol/consts.rs
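Two of these limits are easy to illustrate in isolation. The sketch below uses a hypothetical `LimitError` enum and helper names; the real code returns SignalProtocolError variants, and the exact boundary comparison (`>` rather than `>=`) is an assumption here.

```rust
const MAX_FORWARD_JUMPS: u32 = 25_000;

#[derive(Debug, PartialEq)]
enum LimitError {
    TooFarIntoFuture,
    ChainIndexOverflow,
}

/// Rejects a message whose counter is too far ahead of the chain index.
/// Returns the number of keys that would need to be skipped.
fn check_forward_jump(chain_index: u32, counter: u32) -> Result<u32, LimitError> {
    let jump = counter.saturating_sub(chain_index);
    if jump > MAX_FORWARD_JUMPS {
        return Err(LimitError::TooFarIntoFuture);
    }
    Ok(jump)
}

/// Advances a chain index, reporting overflow instead of wrapping silently.
fn next_chain_index(index: u32) -> Result<u32, LimitError> {
    index.checked_add(1).ok_or(LimitError::ChainIndexOverflow)
}
```

Bounding the jump caps the number of key derivations a single incoming message can force, and `checked_add` turns an index overflow into an explicit error.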

Self-DM / sibling decryption recovery

When a message from your own primary phone or another linked companion fails to decrypt, the v0.6 client distinguishes the failure mode and applies the matching recovery strategy:

Internal `RetryReason`	Trigger	Recovery
`NoSession`	`SessionNotFound` (no Signal session yet for the device)	Request a fresh prekey bundle via retry receipt; install the new session before retrying decryption.
`BadMac`	Ratchet desync (`InvalidMessage` / mac failure) on an existing session	Mark the session for re-creation, throttled per peer via the `session_recreate_history` cache so repeated BadMacs don’t loop, and re-send via a peer-addressed `pkmsg` carrying our identity.

The throttle is a per-peer cooldown (1-hour TTL after the last recreate). In v0.6 the implementation moved from a Mutex<HashMap<Jid, Instant>> to a bounded TTL cache (moka, ~256 entries): the per-peer check-and-stamp is now atomic (serialized by the existing per-peer session lock) and lock-free at the map level, so concurrent retry-receipt spawns from the same peer can’t trigger duplicate recreates. The behavior is unchanged: if a peer is already in cooldown, the client skips re-creation and falls back to a normal retry receipt rather than thrashing the session. Under more than ~256 distinct peers retrying within the window, the cache may evict a recent entry, costing at most one extra recreate (bounded and self-healing).

Peer-addressed pkmsg carries the protocol identity so the receiver can verify ownership before installing the new session, blocking spoofed sibling recoveries. This closed a deadlock where self-DM fan-out to a sibling device produced repeated BadMac decrypt failures: the recipient would request a retry, the sender would re-encrypt against the same broken session, and the cycle would continue until the user manually relogged. With the throttled re-creation plus identity-validated pkmsg, the second receipt installs a fresh session and decryption resumes.

Self-DM fan-out also gained WA Web parity for the BadMac case: when our own primary phone reports BadMac, the client now treats it as a session-level recovery rather than dropping the message, matching WAWebDecryptOrThrow’s branch on session divergence.

Location: src/client.rs, src/retry.rs, wacore/libsignal/src/protocol/session_cipher.rs, wacore/src/send.rs
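The per-peer cooldown's check-and-stamp step can be sketched with a plain HashMap standing in for the bounded TTL cache. `RecreateThrottle` and `try_recreate` are hypothetical names, and this sketch omits the capacity bound and eviction that the moka cache provides.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const RECREATE_COOLDOWN: Duration = Duration::from_secs(60 * 60);

/// Plain-HashMap stand-in for the bounded TTL cache (no eviction here).
struct RecreateThrottle {
    last_recreate: HashMap<String, Instant>,
}

impl RecreateThrottle {
    fn new() -> Self {
        Self { last_recreate: HashMap::new() }
    }

    /// Check-and-stamp: returns true (and records `now`) when the peer may
    /// recreate its session, false while the peer is still in cooldown.
    fn try_recreate(&mut self, peer: &str, now: Instant) -> bool {
        let in_cooldown = self
            .last_recreate
            .get(peer)
            .map_or(false, |last| now.duration_since(*last) < RECREATE_COOLDOWN);
        if in_cooldown {
            return false; // Caller falls back to a normal retry receipt.
        }
        self.last_recreate.insert(peer.to_owned(), now);
        true
    }
}
```

Passing `now` in as a parameter keeps the logic deterministic and easy to test; a caller would normally supply `Instant::now()`.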

Performance optimizations

Session object cache

The SignalStoreCache stores sessions and sender keys as deserialized objects (SessionRecord and SenderKeyRecord) rather than serialized bytes, matching WhatsApp Web’s architecture where the JS object IS the cache. Serialization only happens during flush() to the database — not on every store_session or put_sender_key call.

// wacore/src/store/signal_cache.rs
enum SessionEntry {
    /// Arc so peek_session (retry / LID-migration checks) bumps a refcount
    /// instead of deep-cloning the record (KBs with archived states).
    Present(Arc<SessionRecord>),
    Absent,
    /// Taken by load_session; has_session treats as present, flush/eviction skip.
    CheckedOut,
}

struct SessionStoreState {
    cache: HashMap<Arc<str>, SessionEntry>,  // Objects, not bytes
    dirty: HashSet<Arc<str>>,
    deleted: HashSet<Arc<str>>,
}

// Sender keys use the same object-caching pattern
struct SenderKeyStoreState {
    cache: HashMap<Arc<str>, Option<SenderKeyRecord>>,  // Objects, not bytes
    dirty: HashSet<Arc<str>>,
}

This eliminates prost encode (on store) and decode (on load) from the per-message hot path for both 1:1 and group messages. The store_session method takes SessionRecord by value, enabling zero-cost moves from the protocol layer:

// wacore/libsignal/src/protocol/storage/traits.rs
pub trait SessionStore {
    async fn store_session(
        &mut self,
        address: &ProtocolAddress,
        record: SessionRecord,  // Owned — zero-cost move, no clone
    ) -> Result<()>;
}

All four protocol-layer call sites (message_encrypt, message_decrypt_signal, message_decrypt_prekey, process_prekey_bundle) drop the record immediately after storing. Taking ownership eliminates the .clone() in the adapter, and the compiler enforces no use-after-store.

Per-message hot path impact:

Operation	Before	After
`store_session`	clone all fields + prost encode	move (zero-cost)
`load_session`	prost decode + construct	clone current session only (`previous_sessions` O(1) via Arc)
`peek_session`	deep-clone record (1–2 KB)	`Arc` refcount bump; returns `Option<Arc<SessionRecord>>`
`store_sender_key`	serialize to bytes + store bytes	store `SenderKeyRecord` object directly
`load_sender_key` (`&self`)	load bytes + deserialize	return cached `SenderKeyRecord` object (read lock only)
`flush` (batched)	write bytes to DB	serialize sessions + sender keys + write bytes to DB

Arc previous sessions

SessionRecord.previous_sessions is wrapped in Arc<Vec<SessionStructure>>, making clone O(1) for the ~40 archived previous sessions that previously accounted for ~40% of the serialize cost:

// wacore/libsignal/src/protocol/state/session.rs
pub struct SessionRecord {
    current_session: Option<SessionState>,
    /// Wrapped in Arc so cloning is O(1). Only mutated on rare paths via Arc::make_mut.
    previous_sessions: Arc<Vec<SessionStructure>>,
}

Only rare operations (archive current session, promote previous session, take/restore during session setup) trigger Arc::make_mut and a deep copy.
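The copy-on-write behavior of Arc::make_mut can be demonstrated with a simplified stand-in. `RecordSketch` and its `u32` fields are placeholders for illustration, not the real SessionRecord layout.

```rust
use std::sync::Arc;

/// Simplified stand-in for a record with an Arc-wrapped archive.
#[derive(Clone)]
struct RecordSketch {
    current: u32,
    previous: Arc<Vec<u32>>,
}

impl RecordSketch {
    /// Rare path: archiving mutates the shared Vec, copying it first if
    /// another clone still holds a reference to the same allocation.
    fn archive_current(&mut self) {
        let current = self.current;
        Arc::make_mut(&mut self.previous).insert(0, current);
    }
}
```

Cloning the record only bumps a reference count, so both copies point at the same Vec until one of them mutates it. At that point Arc::make_mut clones the Vec for the writer and leaves the other copy untouched.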

Redundant signal store write elimination

The SignalStoreCache uses targeted deduplication strategies per store type. For identities (which rarely change), put_dedup() compares incoming bytes against the cached value and skips if identical:

// wacore/src/store/signal_cache.rs — ByteStoreState
fn put_dedup(&mut self, address: &str, data: &[u8]) {
    if let Some(Some(existing)) = self.cache.get(address)
        && existing.as_ref() == data
    {
        return; // Skip — data unchanged, no dirty mark
    }
    self.put(address, data);
}

Sessions and sender keys use unconditional put() since they change with every message — dedup would always fail and waste CPU cycles. This split avoids unnecessary database writes during flush() while not adding overhead where it provides no benefit.

Key reuse in cache

The key_for() method on SessionStoreState, SenderKeyStoreState, and ByteStoreState reuses existing Arc<str> keys from the HashMap via get_key_value(), avoiding a heap allocation on every cache operation:

fn key_for(&self, address: &str) -> Arc<str> {
    match self.cache.get_key_value(address) {
        Some((existing, _)) => existing.clone(),  // O(1) refcount bump
        None => Arc::from(address),                // Only on first insert
    }
}

On the hot path (put/delete for addresses already in the cache), this is always a refcount bump instead of a heap allocation.

Single-allocation session lock keys

Session lock keys use the full Signal protocol address string (e.g., 5511999887766@c.us.0). The JidExt trait provides methods for generating these strings, defined in wacore/src/types/jid.rs:

pub trait JidExt {
    /// Construct a fresh ProtocolAddress for this JID.
    fn to_protocol_address(&self) -> ProtocolAddress;

    /// Signal address string: `{user}[:device]@{server}`
    /// Device part only included when device != 0.
    fn to_signal_address_string(&self) -> String;

    /// Full protocol address string: `{signal_address_string}.0`
    /// Equivalent to `to_protocol_address().to_string()` but avoids the
    /// intermediate ProtocolAddress allocation — one String instead of two.
    fn to_protocol_address_string(&self) -> String;

    /// Rewrite a reusable ProtocolAddress in place for this JID.
    /// See [Reusable hot-loop address construction](#reusable-hot-loop-address-construction).
    fn reset_protocol_address(&self, addr: &mut ProtocolAddress);
}

to_protocol_address_string() is used on hot paths (message encryption and decryption) as the key for session_locks. It pre-sizes the output buffer and builds the string in a single allocation, avoiding the two-allocation overhead of constructing a ProtocolAddress and then calling .to_string(). The write_protocol_address_to() free function provides the same formatting but writes into a caller-supplied &mut String buffer, enabling buffer reuse across multiple JIDs (used by session_mutexes_for()).

Format examples:

JID	Signal address	Protocol address string
`5511999887766@s.whatsapp.net`	`5511999887766@c.us`	`5511999887766@c.us.0`
`5511999887766:33@s.whatsapp.net`	`5511999887766:33@c.us`	`5511999887766:33@c.us.0`
`123456789@lid`	`123456789@lid`	`123456789@lid.0`
`123456789:33@lid`	`123456789:33@lid`	`123456789:33@lid.0`

The server s.whatsapp.net is mapped to c.us in address strings, matching WhatsApp Web’s internal format. The trailing .0 is the Signal device_id (always 0 in WhatsApp’s usage).
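The formatting rules in the table can be reproduced with a short free function. This is a sketch under the assumption that a JID decomposes into `user`, `device`, and `server` parts; `protocol_address_string` is a hypothetical name, not the trait method itself.

```rust
use std::fmt::Write;

/// Builds "{user}[:device]@{server}.0" into one pre-sized String.
/// The server "s.whatsapp.net" is rewritten to "c.us".
fn protocol_address_string(user: &str, device: u16, server: &str) -> String {
    let server = if server == "s.whatsapp.net" { "c.us" } else { server };
    // user + ":" + up to 5 device digits + "@" + server + ".0"
    let mut out = String::with_capacity(user.len() + 1 + 5 + 1 + server.len() + 2);
    out.push_str(user);
    if device != 0 {
        // Writing into the existing buffer avoids a temporary String.
        write!(out, ":{device}").expect("writing to a String cannot fail");
    }
    out.push('@');
    out.push_str(server);
    out.push_str(".0");
    out
}
```

Reserving the worst-case capacity up front means none of the later pushes reallocate.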

Usage in message processing:

// In message decryption (src/message.rs) — single lock per sender device
let signal_addr_str = sender_encryption_jid.to_protocol_address_string();
let session_mutex = self.session_locks
    .get_with(signal_addr_str.clone(), async {
        Arc::new(async_lock::Mutex::new(()))
    }).await;
let _session_guard = session_mutex.lock().await;

// In peer message encryption (src/send.rs) — single lock
let signal_addr_str = encryption_jid.to_protocol_address_string();

// In DM message encryption (src/send.rs) — per-device locks for all devices
let lock_jids = self.build_session_lock_keys(&all_dm_devices).await;
let session_mutexes = self.session_mutexes_for(&lock_jids).await;
// Guards acquired in sorted order to prevent deadlocks

DM multi-device fanout: The DM send path resolves all known recipient devices and own companion devices, encrypting per-device for each. This matches WA Web’s WAWebSendUserMsgJob behavior, where the local device table is read on the send path and WAWebDBDeviceListFanout filters out hosted devices. The client checks the local device registry first (via get_devices_from_registry()); a network fetch is only triggered on a cache miss to avoid unnecessary LID-migration side effects from get_user_devices. The sender device is excluded (matching WA Web’s isMeDevice in getFanOutList), and for self-DMs, overlapping device lists are deduplicated using a HashSet (matching WA Web’s Map keyed by toString).

Own-device namespace alignment (v0.6): When the recipient is addressed in the LID namespace (@lid), the client converts its own companion devices from the PN namespace to LID before fanning out. Without this alignment, a <to> mix of @lid and @s.whatsapp.net participants caused the server to reject the stanza for LID-addressed DMs. Outgoing messages to PN-addressed recipients are unaffected.

Own companion devices (your other linked devices) receive per-device encryption for multi-device self-sync via DeviceSentMessage.
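The sender exclusion and self-DM deduplication described above might look roughly like the sketch below. DeviceJid and fan_out_list are hypothetical names introduced for illustration; device resolution and namespace alignment are assumed to have happened already.

```rust
use std::collections::HashSet;

/// Hypothetical device-level JID used only for this sketch.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct DeviceJid {
    pub user: String,
    pub server: String,
    pub device: u16,
}

/// Merge recipient devices and own companion devices into one fanout list,
/// skipping the sending device and dropping entries that appear in both lists.
pub fn fan_out_list(
    recipient: &[DeviceJid],
    own: &[DeviceJid],
    sender: &DeviceJid,
) -> Vec<DeviceJid> {
    let mut seen: HashSet<&DeviceJid> = HashSet::new();
    let mut out = Vec::with_capacity(recipient.len() + own.len());
    for jid in recipient.iter().chain(own.iter()) {
        if jid == sender {
            continue; // never encrypt for the device that is sending
        }
        if seen.insert(jid) {
            out.push(jid.clone()); // first occurrence wins, input order is kept
        }
    }
    out
}
```

For a self-DM the recipient list and the own-device list overlap completely, so the set is what keeps each device from being encrypted for twice.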

WA Web has a bare-<enc> fast path for a single primary device (WAWebSendMsgCreateFanoutStanza). This is not implemented in whatsapp-rust because encrypt_for_devices always wraps in <to jid=...> nodes. The <participants> form is accepted by the server regardless.

Fail-fast on total encrypt failure (v0.6). If per-device encryption fails for every recipient device, the DM send now returns an error instead of emitting a stanza with an empty participant list (which the server would silently swallow, making the message look sent when it wasn’t). A partial failure — some devices encrypt, some don’t — still sends to the devices that succeeded.
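A minimal sketch of the fail-fast rule, assuming per-device encryption yields one Result per device. FanoutError and keep_successes are hypothetical names, and treating an empty input as a failure is an assumption of this sketch rather than something the text above specifies.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum FanoutError {
    AllDevicesFailed { attempted: usize },
}

/// Keep the devices that encrypted successfully; fail only if none did.
pub fn keep_successes<T, E>(results: Vec<Result<T, E>>) -> Result<Vec<T>, FanoutError> {
    let attempted = results.len();
    // Partial failure: drop the failed devices and carry on with the rest.
    let encrypted: Vec<T> = results.into_iter().filter_map(Result::ok).collect();
    if encrypted.is_empty() {
        // Total failure: surface an error instead of an empty participant list.
        return Err(FanoutError::AllDevicesFailed { attempted });
    }
    Ok(encrypted)
}
```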

DM per-device locking: To prevent ratchet desync when concurrent sends and receives operate on the same Signal session, the DM path acquires session locks for all devices involved — the bare recipient plus own companion devices. The build_session_lock_keys() helper resolves encryption JIDs and sorts them for deadlock-free lock acquisition:

Resolves the recipient to its bare encryption JID via resolve_encryption_jid().to_non_ad() (stripping device component)
Resolves own companion device JIDs
Sorts by (server, user, device) using cmp_for_lock_order() and deduplicates
Returns sorted Vec<Jid> — no intermediate String allocations needed for sorting

The session_mutexes_for() helper then converts sorted JIDs to session mutexes, reusing a single String buffer via write_protocol_address_to() to avoid per-JID heap allocations:

// In DM message encryption (src/send.rs)
let lock_jids = self.build_session_lock_keys(&all_dm_devices).await;
let session_mutexes = self.session_mutexes_for(&lock_jids).await;
// Guards acquired in sorted order to prevent deadlocks

The recipient lock key is always the bare form (e.g., 100000012345678@lid.0), matching the decrypt path’s lock format. This ensures send and receive paths serialize on the exact same lock key.

Location: wacore/src/types/jid.rs:4-51, src/send.rs:1481-1507
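The ordering idea can be illustrated with a small sketch: if every task sorts its lock keys the same way and acquires them in that order, two tasks cannot each hold a lock the other is waiting for. The names LockJid, cmp_lock_order, sorted_lock_keys, and lock_all are hypothetical, and std::sync::Mutex stands in for the async mutex used in the snippets above to keep the example dependency-free.

```rust
use std::cmp::Ordering;
use std::sync::{Arc, Mutex, MutexGuard};

/// Hypothetical lock key with the three fields the ordering uses.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct LockJid {
    pub user: String,
    pub server: String,
    pub device: u16,
}

/// Total order by (server, user, device), compared field by field without
/// building intermediate strings.
fn cmp_lock_order(a: &LockJid, b: &LockJid) -> Ordering {
    a.server
        .cmp(&b.server)
        .then_with(|| a.user.cmp(&b.user))
        .then_with(|| a.device.cmp(&b.device))
}

/// Sort and deduplicate so each session lock is taken once, in a global order.
pub fn sorted_lock_keys(mut jids: Vec<LockJid>) -> Vec<LockJid> {
    jids.sort_unstable_by(cmp_lock_order);
    jids.dedup(); // equal keys are adjacent after sorting
    jids
}

/// Acquire every mutex in slice order and hold all guards until they are dropped.
/// The slice must not contain the same mutex twice, which the dedup above ensures.
pub fn lock_all(mutexes: &[Arc<Mutex<()>>]) -> Vec<MutexGuard<'_, ()>> {
    mutexes
        .iter()
        .map(|m| m.lock().expect("session lock poisoned"))
        .collect()
}
```

Deduplication matters as much as sorting here: locking the same non-reentrant mutex twice from one task would block forever.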

Single-buffer ProtocolAddress

The ProtocolAddress struct stores the full address string "{name}.{device_id}" in a single String buffer, with a name_len marker to split name from suffix. This halves the allocation count compared to storing name and device ID separately, and eliminates the copy when rewriting the address via reset_with().

// wacore/libsignal/src/core/address.rs
pub struct ProtocolAddress {
    buf: String,       // "{name}.{device_id}" in one buffer
    name_len: usize,   // marks where the name ends
    device_id: DeviceId,
}

impl ProtocolAddress {
    /// One-shot construction: takes ownership of `name` and appends the suffix.
    pub fn new(name: String, device_id: DeviceId) -> Self;

    /// Pre-allocated empty address for hot-loop reuse. Call `reset_with()` to fill.
    pub fn with_capacity(capacity: usize, device_id: DeviceId) -> Self;

    /// Rewrite the address in place via closure. Single write pass — no intermediate copy.
    pub fn reset_with(&mut self, write_name: impl FnOnce(&mut String));

    /// Zero-cost slice of the name portion.
    pub fn name(&self) -> &str;

    /// Zero-cost slice of the full buffer ("{name}.{device_id}").
    pub fn as_str(&self) -> &str;
}

Both name() and as_str() are zero-cost slices into the same buffer — no allocations on access.
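To make the layout concrete, here is a self-contained sketch of the same idea. It is not the crate's implementation: SingleBufferAddress is a hypothetical type, and a plain u32 replaces DeviceId.

```rust
use std::fmt::Write;

/// Hypothetical single-buffer address: "{name}.{device_id}" lives in one String.
pub struct SingleBufferAddress {
    buf: String,
    name_len: usize, // byte offset where the name ends and ".{device_id}" begins
    device_id: u32,
}

impl SingleBufferAddress {
    /// Reuse the caller's `name` allocation and append the suffix to it.
    pub fn new(mut name: String, device_id: u32) -> Self {
        let name_len = name.len();
        // Writing into a String cannot fail, so the Result is ignored.
        let _ = write!(name, ".{device_id}");
        Self { buf: name, name_len, device_id }
    }

    /// Rewrite the address in place: clear, let the closure write the new name,
    /// record its length, then re-append the device suffix.
    pub fn reset_with(&mut self, write_name: impl FnOnce(&mut String)) {
        self.buf.clear(); // keeps the existing capacity
        write_name(&mut self.buf);
        self.name_len = self.buf.len();
        let _ = write!(self.buf, ".{}", self.device_id);
    }

    /// Slice of the name portion; no allocation.
    pub fn name(&self) -> &str {
        &self.buf[..self.name_len]
    }

    /// Slice of the full "{name}.{device_id}" string; no allocation.
    pub fn as_str(&self) -> &str {
        &self.buf
    }
}
```

Because reset_with clears rather than replaces the String, a loop that rewrites the address for each device only allocates when a name outgrows the existing capacity.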

Reusable hot-loop address construction

When iterating over many devices (e.g., during group stanza preparation or session resolution), allocating a fresh ProtocolAddress per device is wasteful. The JidExt trait provides reset_protocol_address() to rewrite a pre-allocated address in place, and make_reusable_protocol_address() creates the initial buffer:

// wacore/src/types/jid.rs
pub fn make_reusable_protocol_address() -> ProtocolAddress {
    ProtocolAddress::with_capacity(64, SIGNAL_DEVICE_ID)
}

pub trait JidExt {
    /// Rewrite a reusable ProtocolAddress in place for this JID.
    fn reset_protocol_address(&self, addr: &mut ProtocolAddress);
}

Usage in group stanza preparation:

// wacore/src/send.rs — session resolution loop
let mut reusable_addr = make_reusable_protocol_address();

for device_jid in devices {
    // Rewrite the same buffer — no new allocation per device
    device_jid.reset_protocol_address(&mut reusable_addr);

    if stores.session_store.load_session(&reusable_addr).await?.is_some() {
        // Session exists — use it
    }
}

This pattern eliminates one String allocation per device in the loop. For a group with 100 participant devices, that saves 100 heap allocations on the send path. The pre-allocated capacity of 64 bytes covers all known WhatsApp address formats without reallocation.

Use to_protocol_address() for one-shot address construction (e.g., cache keys, single lookups). Use make_reusable_protocol_address() + reset_protocol_address() when iterating over multiple JIDs in a tight loop.

Location: wacore/libsignal/src/core/address.rs, wacore/src/types/jid.rs, wacore/src/send.rs

Zero-Allocation JID Deduplication

Group stanza preparation needs to deduplicate participant JIDs at two stages: before device resolution (by user identity) and after LID conversion (by device identity). Two utility functions in wacore/src/types/jid.rs handle this with in-place sorted dedup instead of HashSet allocations:

/// Sort and deduplicate by user identity (user + server).
pub fn sort_dedup_by_user(jids: &mut Vec<Jid>);

/// Sort and deduplicate by device identity (user + server + agent + device).
pub fn sort_dedup_by_device(jids: &mut Vec<Jid>);

Both use sort_unstable_by followed by dedup_by, comparing JID fields directly without allocating intermediate strings or hash sets. This is more efficient than the HashSet<(String, String)> approach because:

No per-JID String::clone() for hash keys
No HashSet allocation or hashing overhead
Deterministic output order (sorted) instead of hash-dependent iteration

Usage in group sends (wacore/src/send.rs):

// Before device resolution — dedup participants by user identity
sort_dedup_by_user(&mut jids_to_resolve);

// After LID conversion — dedup devices by full device identity
// Catches duplicates where both phone and LID queries resolve
// to the same device (e.g., 559980000003:33 and 100000037037034:33@lid)
sort_dedup_by_device(&mut resolved_list);

Location: wacore/src/types/jid.rs:33-51
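The sort-then-dedup technique can be sketched with a stand-in type. MiniJid, sort_dedup_users, and sort_dedup_devices are hypothetical names; the field set follows the doc comments above, and the exact comparison order used by the crate is an assumption.

```rust
/// Hypothetical JID with the fields named in the doc comments above.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MiniJid {
    pub user: String,
    pub server: String,
    pub agent: u8,
    pub device: u16,
}

/// Keep one entry per (user, server). Which device of a user survives is
/// unspecified, because the sort is unstable and ignores the device field.
pub fn sort_dedup_users(jids: &mut Vec<MiniJid>) {
    jids.sort_unstable_by(|a, b| a.user.cmp(&b.user).then_with(|| a.server.cmp(&b.server)));
    jids.dedup_by(|a, b| a.user == b.user && a.server == b.server);
}

/// Keep one entry per full device identity (user, server, agent, device).
pub fn sort_dedup_devices(jids: &mut Vec<MiniJid>) {
    jids.sort_unstable_by(|a, b| {
        a.user
            .cmp(&b.user)
            .then_with(|| a.server.cmp(&b.server))
            .then_with(|| a.agent.cmp(&b.agent))
            .then_with(|| a.device.cmp(&b.device))
    });
    jids.dedup(); // duplicates are adjacent after sorting on every field
}
```

Both functions work on the Vec in place, so the only memory touched is the list that already exists.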

Take/Restore Pattern

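The take/restore pattern avoids cloning session state during decryption attempts: the state is moved out of the record, used, and then put back whether or not decryption succeeded.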

// Take ownership instead of cloning
if let Some(mut current_state) = record.take_session_state() {
    let result = decrypt_message_with_state(&mut current_state, ...);
    match result {
        Ok(ptext) => {
            record.set_session_state(current_state);
            return Ok(ptext);
        }
        Err(e) => {
            record.set_session_state(current_state);  // Restore
        }
    }
}

Location: wacore/libsignal/src/protocol/session_cipher.rs:495-564
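A stripped-down, runnable illustration of the same ownership trick is shown below. The types and the decrypt_with_state placeholder are hypothetical and perform no cryptography; the point is only that Option::take moves the state out without a clone and that it is restored on both the success and the failure path.

```rust
/// Hypothetical session state; `counter` stands in for ratchet progress.
#[derive(Debug, PartialEq)]
pub struct SessionState {
    pub counter: u32,
}

/// Hypothetical record holding the current session state.
pub struct SessionRecord {
    current: Option<SessionState>,
}

impl SessionRecord {
    pub fn new(state: SessionState) -> Self {
        Self { current: Some(state) }
    }

    /// Move the state out of the record, leaving `None` behind (no clone).
    pub fn take_session_state(&mut self) -> Option<SessionState> {
        self.current.take()
    }

    pub fn set_session_state(&mut self, state: SessionState) {
        self.current = Some(state);
    }
}

/// Placeholder for real decryption: fails on empty input, otherwise echoes it.
fn decrypt_with_state(state: &mut SessionState, ciphertext: &[u8]) -> Result<Vec<u8>, String> {
    if ciphertext.is_empty() {
        return Err("empty ciphertext".to_string());
    }
    state.counter += 1;
    Ok(ciphertext.to_vec())
}

pub fn decrypt(record: &mut SessionRecord, ciphertext: &[u8]) -> Result<Vec<u8>, String> {
    let Some(mut state) = record.take_session_state() else {
        return Err("no current session".to_string());
    };
    let result = decrypt_with_state(&mut state, ciphertext);
    // Put the state back on both paths so a failed attempt does not lose the session.
    record.set_session_state(state);
    result
}
```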

Buffer Reuse

Thread-local buffers eliminate per-message allocations:

const INITIAL_CAPACITY: usize = 1024;
const MAX_CAPACITY: usize = 16 * 1024;
const SHRINK_THRESHOLD: usize = 100;

struct EncryptionBuffer {
    buffer: Vec<u8>,
    usage_count: usize,
}

impl EncryptionBuffer {
    /// Hand out the reusable buffer. Every SHRINK_THRESHOLD uses, an oversized
    /// allocation is replaced so one large message does not pin memory forever.
    fn get_buffer(&mut self) -> &mut Vec<u8> {
        self.usage_count += 1;
        if self.usage_count.is_multiple_of(SHRINK_THRESHOLD)
            && self.buffer.capacity() > MAX_CAPACITY
        {
            self.buffer = Vec::with_capacity(INITIAL_CAPACITY);
        }
        &mut self.buffer
    }
}

Location: wacore/libsignal/src/protocol/session_cipher.rs:20-54
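How such a buffer could be made thread-local is sketched below using the standard thread_local! macro and a RefCell. ScratchBuffer, SCRATCH, and with_scratch are hypothetical names, and clearing the buffer on each use is an assumption of this sketch rather than documented behavior.

```rust
use std::cell::RefCell;

const INITIAL_CAPACITY: usize = 1024;
const MAX_CAPACITY: usize = 16 * 1024;
const SHRINK_THRESHOLD: usize = 100;

/// Hypothetical reusable scratch buffer with periodic shrinking.
struct ScratchBuffer {
    buffer: Vec<u8>,
    usage_count: usize,
}

impl ScratchBuffer {
    fn new() -> Self {
        Self { buffer: Vec::with_capacity(INITIAL_CAPACITY), usage_count: 0 }
    }

    fn get(&mut self) -> &mut Vec<u8> {
        self.usage_count += 1;
        // Every SHRINK_THRESHOLD uses, replace an oversized allocation.
        if self.usage_count % SHRINK_THRESHOLD == 0 && self.buffer.capacity() > MAX_CAPACITY {
            self.buffer = Vec::with_capacity(INITIAL_CAPACITY);
        }
        self.buffer.clear(); // assumption: callers always start from an empty buffer
        &mut self.buffer
    }
}

thread_local! {
    // One buffer per thread, so no locking is needed to reach it.
    static SCRATCH: RefCell<ScratchBuffer> = RefCell::new(ScratchBuffer::new());
}

/// Run `f` with this thread's scratch buffer. Do not call with_scratch again
/// from inside `f`: the RefCell would panic on the second mutable borrow.
pub fn with_scratch<R>(f: impl FnOnce(&mut Vec<u8>) -> R) -> R {
    SCRATCH.with(|cell| {
        let mut scratch = cell.borrow_mut();
        f(scratch.get())
    })
}
```

The closure-based accessor keeps the borrow scoped to a single call, which is the usual way to avoid leaking a RefCell borrow out of a thread-local.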

Public API

The client.signal() accessor exposes low-level Signal protocol operations for direct use. This includes 1:1 and group encryption/decryption, session validation, session deletion, participant node creation, and device resolution. See Signal API reference for full method documentation and examples.

Related Components

Binary Protocol - How encrypted messages are serialized
State Management - How session state is persisted
WebSocket Handling - Transport layer for encrypted messages
Signal API - Public API for Signal protocol operations

References

Signal Protocol Specification
libsignal Repository
Source: wacore/libsignal/src/protocol/
Storage: src/store/signal.rs, src/store/signal_adapter.rs, wacore/src/store/signal_cache.rs
