Nucleus Framework
Production-ready microservices foundation for Java / Spring Boot
Nucleus is a modular platform layer that provides the infrastructure every enterprise microservices application needs -- authentication, audit trails, messaging, monitoring, email, storage, and more. Add it as a Maven dependency and start building business logic on day one.
Quick Start
1. Add the parent POM
<parent>
    <groupId>com.nucleus</groupId>
    <artifactId>nucleus-parent</artifactId>
    <version>0.0.1</version>
</parent>
2. Pick your modules
<dependencies>
    <dependency>
        <groupId>com.nucleus</groupId>
        <artifactId>nucleus-core</artifactId>
        <version>0.0.1</version>
    </dependency>
    <dependency>
        <groupId>com.nucleus</groupId>
        <artifactId>nucleus-audit-core</artifactId>
        <version>0.0.1</version>
    </dependency>
</dependencies>
3. Configure
spring:
  config:
    import: configserver:https://your-config-server
  application:
    name: my-service
  profiles:
    active: monitoring
Architecture
flowchart TB
subgraph NucleusApps["Nucleus runtime services"]
direction LR
NM["nucleus-monitoring
(deployed as a monitoring service)"]
NSP["nucleus-system-property
(deployed as a config service)"]
NAuth["nucleus-authentication
(deployed as an OAuth2 server)"]
NMail["nucleus-mail
(deployed as a mail service)"]
end
subgraph Libs["Nucleus libraries"]
direction LR
NC["nucleus-core
JWT · AI · workflow"]
NMC["nucleus-monitoring-client
heartbeat · CPU · GC · threads"]
NCon["nucleus-connectors
@NucleusListener / @NucleusPublish"]
NAC["nucleus-audit-core
@AuditAction"]
NSt["nucleus-storage
MinIO wrapper"]
end
subgraph Infra["Infrastructure"]
direction LR
K[("Kafka")]
M[("MinIO")]
DB[("MySQL / Postgres")]
CS["config-server"]
end
Libs --> NucleusApps
NucleusApps --> Infra
CS -.->|serves config| NucleusApps
nucleus-parent (Maven parent POM)
nucleus-bom (Bill of Materials)
nucleus-core (foundation: security, JWT, pagination, AI, workflow)
nucleus-connectors (@NucleusListener — 13 middleware backends)
nucleus-connectors-admin (runtime provisioning, REST + UI)
nucleus-monitoring-client (self-registration, remote logging)
nucleus-audit-core (@AuditAction declarative audit)
nucleus-storage (MinIO/S3 file storage)
nucleus-execution-events (async job framework)
nucleus-authentication (OAuth2/OIDC server)
nucleus-mail-common (shared DTOs)
nucleus-mail (rendering + sending + step tracking)
nucleus-mail-renderer (Thymeleaf template engine)
nucleus-mail-sender (SMTP delivery)
nucleus-ui-message-broker (SSE with role-based delivery)
nucleus-config (Spring Cloud Config Server)
nucleus-user (user management + roles)
nucleus-address (Google Maps integration)
nucleus-core
Foundation
The base layer every Nucleus service extends. Provides security filters, JWT validation, pagination, error handling, and report generation.
- BaseComponent -- base class with common utilities (dialog, navigation, error handling)
- JWT Security -- automatic JWT validation via Spring Security resource server
- Pagination -- generic Page, PageResult, SearchService interfaces
- Report Generation -- PDF, Excel, CSV export with built-in file viewer
- Permission Checker -- role-based access utilities
- ControllerErrorHandler -- centralised `@RestControllerAdvice` translating domain exceptions to HTTP responses
ControllerErrorHandler
A single @RestControllerAdvice in nucleus-core maps every domain exception to the right HTTP status + body, so services never have to write their own try/catch wrappers at the controller layer. Throw the right exception, get the right response.
| Exception | HTTP | Body | Typical use |
|---|---|---|---|
| `ResourceNotFoundException` | 404 | Message surfaced | Entity lookup miss |
| `DuplicateResourceException` | 409 | Message surfaced | Unique-constraint violations |
| `InvalidResourceStateException` | 409 | Message surfaced | Illegal state transitions (e.g. restore of an already-restored recording) |
| `IllegalStateException` | 409 | Message surfaced | Business-rule conflicts (e.g. overlapping schedule windows) |
| `IllegalArgumentException` | 400 | Message surfaced | Caller-side invalid input |
| `HttpMessageNotReadableException` | 400 | Terse | Malformed request JSON |
| `AccessDeniedException` | 403 | Terse | Spring Security denials |
| `Exception` (fallback) | 500 | Terse | Unhandled failures — logged in full |
Guideline: throw IllegalStateException for business-rule conflicts that aren't duplicates, IllegalArgumentException for bad input the controller didn't catch, and the Nucleus-specific exceptions when you want a domain-level error surfaced verbatim to the client.
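For example, a service method can rely entirely on the advice; a hedged sketch (the entity, repository, and exception constructors are assumed for illustration):

```java
@Service
@RequiredArgsConstructor
public class RecordingService {

    private final RecordingRepository repository; // illustrative repository

    public Recording restore(Long id) {
        // Lookup miss → ControllerErrorHandler maps this to 404 with the message surfaced.
        Recording recording = repository.findById(id)
                .orElseThrow(() -> new ResourceNotFoundException("Recording " + id + " not found"));
        // Illegal state transition → mapped to 409 with the message surfaced.
        if (recording.isRestored()) {
            throw new InvalidResourceStateException("Recording " + id + " is already restored");
        }
        return repository.save(recording.markRestored());
    }
}
```

No controller-side try/catch is needed; the advice translates both exceptions into the responses in the table above.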
nucleus-core · AI Framework
AI
Provider-agnostic LLM client. Callers build a provider-agnostic AiRequest and hand it to AiClient; the client routes to the configured provider and returns a strongly-typed AiResult<R>. No exceptions leave execute.
AiRequest request = AiRequest.builder()
        .provider(AiProvider.ANTHROPIC)
        .model("claude-3-5-sonnet")
        .systemPrompt("You are a transaction categorizer. Respond with JSON only.")
        .userPrompt(description)
        .temperature(0.2)
        .build();

AiResult<CategoryResponse> result = aiClient.execute(request, CategoryResponse.class);
if (result.isSuccess()) {
    return result.getData();
}
// result.getErrorCode() is one of: RATE_LIMIT, HTTP_ERROR, TIMEOUT,
// PARSING_ERROR, INVALID_RESPONSE, UNKNOWN
Providers
- OpenAI — Chat Completions, `response_format=json_object`, multimodal via `image_url` data blocks.
- Anthropic — Messages API, `x-api-key` auth, multimodal via base64 image blocks, `max_tokens` defaults to 4096.
- Add your own — three beans (`AIProxy`, `AiRequestBuilder`, `AiResponseExtractor`) plus an enum constant. Registered automatically at startup.
Features
- Strategy-pattern architecture — per-provider request builders and response extractors resolved via `provider()`.
- Retry on transient failures only (connect errors, timeouts) — never on HTTP error responses.
- Configurable timeout, retry count, backoff.
- Multimodal support (`AiImage`) encoded provider-natively.
- Errors classified into six semantic codes so callers can react consistently across providers (see the sketch below).
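A hedged sketch of reacting to those codes — the constants come from the example above; the handlers (`requeueForLater`, `flagForManualReview`, `alertOps`) are placeholders, and `getErrorCode()` returning an enum is our assumption:

```java
AiResult<CategoryResponse> result = aiClient.execute(request, CategoryResponse.class);
if (!result.isSuccess()) {
    switch (result.getErrorCode()) {
        case RATE_LIMIT, TIMEOUT -> requeueForLater(request);                 // transient — try again later
        case PARSING_ERROR, INVALID_RESPONSE -> flagForManualReview(request); // model output unusable as typed JSON
        case HTTP_ERROR, UNKNOWN -> alertOps(result);                         // provider-side or unclassified failure
    }
}
```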
nucleus-core · Workflow + Step Executors
Processing
Nucleus-standard way to model multi-step async business flows. Each step runs on its own Kafka topic (horizontal scaling per stage), has three outcomes — success, not-handled (delegate to next step), technical failure (abort) — and the orchestrator stays declarative.
flowchart LR Start(["@WorkflowStart
startWorkflow(req)"]) --> S1["@WorkflowStep
onRuleEngine"] S1 -->|success| OK1(["publishSuccess
workflow.succeed()"]) S1 -->|notHandled
delegate| S2["@WorkflowStep
onClientLevel"] S2 -->|success| OK2(["publishSuccess"]) S2 -->|notHandled
delegate| S3["@WorkflowStep
onBankLevel"] S3 -->|success| OK3(["publishSuccess"]) S3 -->|notHandled
delegate| S4["@WorkflowStep
onAIClassification"] S4 -->|success| OK4(["publishSuccess"]) S1 -.->|technicalFailed| Abort(["workflow.failed
ErrorCode + reason"]) S2 -.->|technicalFailed| Abort S3 -.->|technicalFailed| Abort S4 -.->|technicalFailed| Abort S4 -->|notHandled| Manual(["manual review queue"])
@WorkflowStart
public void startWorkflow(Request req) {
    workflow.start();
    workflow.dispatchRuleEngineClassification(req);
}

@WorkflowStep(mode = WorkflowStepMode.QUEUE)
public void onRuleEngine(RuleEngineRequest req) {
    var result = ruleStep.handle(req);
    if (result.isSuccess()) {
        resultService.publishSuccess(req, result.response);
        workflow.succeed();
    } else if (result.isNotHandled()) {
        workflow.dispatchAIClassification(req.asAI());
    } else {
        workflow.failed(ErrorCodes.TECHNICAL_FAILURE, result.message);
    }
}
- `@WorkflowStart` / `@WorkflowStep` / `@WorkflowEnd` annotations mark the lifecycle.
- `WorkflowStepMode.QUEUE` routes the step through a Kafka topic — different services can own different steps.
- `StepExecutor` returns `StepResult` (`success`, `notHandled`, `technicalFailed`). Same three-way branch in every step.
- Reference implementations: expense OCR pipeline, transaction classification (rule-engine → client-level → bank-level → AI).
nucleus-authentication
Security
Full OAuth2/OIDC authorization server built on Spring Authorization Server.
- JWT tokens with RSA rotating keys
- Login/logout audit trail
- Role-based access (ROLE_ADMIN, ROLE_CLIENT, ROLE_AUDITOR, etc.)
- Configurable token expiry
- Branded login page with Thymeleaf templates
nucleus-audit-core
Compliance
Add audit trails to any service method with a single annotation.
@AuditAction(
    action = "Update Expense",
    entityName = "Expense",
    description = "Updated expense #{#result.id} for #{#result.vendor}"
)
public Expense updateExpense(ExpenseDTO dto) { ... }
- SpEL expressions for dynamic descriptions
- Automatically captures username, timestamp, entity ID
- Published to Kafka for real-time audit dashboards
- Stored in audit database with full history
nucleus-connectors
Messaging
Middleware-agnostic messaging framework. One @NucleusListener annotation works against 13 different brokers. Every vendor adapter conforms to the same SPI — a developer writing a listener doesn't know or care what's on the other end of the wire.
flowchart LR
subgraph AppCode["Application code"]
L["@NucleusListener
onOrder(Order o)"]
P["@NucleusPublish
TopicPublisher"]
end
subgraph SPI["Nucleus connector SPI"]
R["MessageRouter
format + serialization"]
end
subgraph Backends["13 backend adapters"]
direction TB
K["Apache Kafka
(+ MSK, Event Hubs)"]
RMQ["RabbitMQ
(+ Amazon MQ)"]
AWS["SQS · SNS · Kinesis"]
GCP["Google Pub/Sub"]
AZ["Azure Service Bus
Queue Storage"]
JMS["JMS: ActiveMQ
Artemis · Solace · IBM MQ"]
SO["Solace (native JCSMP)"]
end
L --> R
P --> R
R --> K
R --> RMQ
R --> AWS
R --> GCP
R --> AZ
R --> JMS
R --> SO
flowchart TB
subgraph Broker["Shared Kafka broker"]
direction TB
ACME_H["acme.service.health"]
ACME_L["acme.remote.logs"]
WIDGETS_H["widgets.service.health"]
WIDGETS_L["widgets.remote.logs"]
N_H["nucleus.service.health
(default, no override)"]
end
subgraph Deploy1["Product deployment A
nucleus.application=ACME"]
D1["services"]
D1Props["PROPERTIES override:
kafka.service.health.topic.name=
acme.service.health"]
end
subgraph Deploy2["Product deployment B
nucleus.application=WIDGETS"]
D2["services"]
D2Props["PROPERTIES override:
kafka.service.health.topic.name=
widgets.service.health"]
end
subgraph Nuc["Standalone Nucleus
nucleus.application=NUCLEUS (default)"]
N1["services (no override)"]
end
Deploy1 --> ACME_H
Deploy1 --> ACME_L
Deploy2 --> WIDGETS_H
Deploy2 --> WIDGETS_L
Nuc --> N_H
@NucleusListener(
    connector = "orders",
    topic = "${orders.topic}",
    format = MessageFormat.JSON,
    mode = ConsumeMode.QUEUE
)
public void onOrder(Order o) {
    // first parameter type drives deserialization — no manual JSON handling
}
Supported backends
| Category | Backends | Module |
|---|---|---|
| Apache Kafka | Kafka, Amazon MSK, Azure Event Hubs (Kafka-compatible) | nucleus-connectors-kafka |
| AMQP | RabbitMQ, Amazon MQ (RabbitMQ engine) | nucleus-connectors-rabbitmq |
| AWS | SQS, SNS, Kinesis Data Streams | nucleus-connectors-sqs / -sns / -kinesis |
| Google Cloud | Pub/Sub | nucleus-connectors-pubsub |
| Azure | Service Bus, Queue Storage | nucleus-connectors-servicebus / -azurequeue |
| JMS family | ActiveMQ, Artemis, Solace (JMS), IBM MQ | nucleus-connectors-jms-* |
| Solace native | PubSub+ via JCSMP — topic wildcards, Direct messaging, replay | nucleus-connectors-solace |
Runtime provisioning
Connectors are addressable by logical name. Instances can be created at runtime through nucleus-connectors-admin — REST + UI flow to add a new broker, validate it, and start consuming without restarting the service. Ideal for multi-tenant deployments where each customer's cluster is onboarded on demand.
Vendor-specific features
Where a backend exposes something the generic SPI can't represent — Kafka partition keys, RabbitMQ exchanges, Solace topic hierarchies — opt-in meta-annotations ride alongside the core annotation:
@NucleusListener(connector = "orders", topic = "orders/>/ca/*", format = MessageFormat.JSON)
@Solace(deliveryMode = GUARANTEED, flowProfile = "high-throughput")
public void onOrder(Order o) { ... }
The core annotation stays generic; vendor extensions are additive, never parallel APIs.
Module selection
- Only include the adapter modules you need — vanilla deployments carry zero third-party broker code.
- Licensed JARs (IBM MQ, Solace) are scoped to their own modules; customers inherit the license only if they depend on those modules.
@NucleusPublish
Field-level annotation that injects a pre-configured TopicPublisher, eliminating publisher wrapper boilerplate. Mirrors @NucleusListener for the publish side.
Before (one class per topic):
@Component
@RequiredArgsConstructor
public class OrderPublisher implements DataPublisher<Order> {

    @Value("${orders.topic}")
    private final String topic;

    @MiddlewareQualifier(type = KAFKA)
    @MessageFormatQualifier(format = JSON)
    private final Publisher publisher;

    public void publish(Order o) {
        publisher.publish(topic, o, null);
    }
}
After (inline field):
@Component
public class OrderService {

    @NucleusPublish(topic = "${orders.topic}")
    private TopicPublisher orderPublisher;

    public void placeOrder(Order o) {
        // ... business logic ...
        orderPublisher.send(o);
    }
}
| Attribute | Default | Description |
|---|---|---|
| `topic` | required | Destination. Supports `${placeholders}` |
| `connector` | `""` (primary) | Logical connector name — `"kafka"`, `"rabbitmq"`, etc. |
| `format` | `JSON` | Serialization format |
| `transactional` | `true` | Use Kafka transactions (set `false` for fire-and-forget) |
TopicPublisher API:
| Method | Description |
|---|---|
| `send(payload)` | Send to the pre-configured topic |
| `send(payload, key)` | With partition key |
| `send(payload, headers)` | With custom headers |
| `send(payload, key, headers)` | Both key and headers |
Lombok compatibility: The field must be non-final — it's injected by a BeanPostProcessor after construction. @RequiredArgsConstructor works fine for other final fields; the publisher field is simply excluded from the constructor.
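A sketch of that interplay (the repository and entity are invented for illustration):

```java
@Component
@RequiredArgsConstructor
public class InvoiceService {

    private final InvoiceRepository repository;   // final — included in the Lombok-generated constructor

    @NucleusPublish(topic = "${invoices.topic}")
    private TopicPublisher invoicePublisher;      // non-final — injected by the BeanPostProcessor

    public void close(Invoice invoice) {
        repository.save(invoice);
        invoicePublisher.send(invoice);           // publish after the state change
    }
}
```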
nucleus-monitoring-client
Observability
Lightweight service monitoring. Add the dependency, enable the monitoring profile, and your service self-registers with health snapshots, remote logging, thread dumps, and GC events — all via Kafka push. The monitoring service aggregates the streams into a fleet registry, drives a Log Explorer with infinite scroll + a timeline playback viewmode, and powers the Session Recordings capture-and-replay feature.
Configuration Properties (PROPERTIES table)
Global (PROFILE=monitoring):
| Property | Default | Description |
|---|---|---|
| `nucleus.monitoring.remote-logging.auto-attach` | true | Auto-attach KafkaLogAppender to the root logger at boot (no logback.xml needed) |
| `nucleus.monitoring.remote-logging.allowed` | true | Global default — whether remote logging is permitted |
| `nucleus.monitoring.publish-interval-ms` | 10000 | Health snapshot publish frequency (ms) |
| `nucleus.monitoring.stale-check-ms` | 30000 | Time before a service is considered stale |
| `nucleus.monitoring.log-retention.days` | 7 | Days before logs are archived to MinIO and purged from the DB |
Per-service (PROFILE=bookwise-{service-name}):
| Property | Default | Description |
|---|---|---|
| `nucleus.monitoring.remote-logging.allowed` | true | Hard lock — false rejects ENABLE_LOGS, UI toggle greyed out |
| `nucleus.monitoring.remote-logging.enabled` | false | Auto-activate remote logging at boot |
| `nucleus.monitoring.publish-interval-ms` | 10000 | Override health publish frequency |
Remote Logging Scenarios
| Scenario | logback.xml | enabled | allowed | Behaviour |
|---|---|---|---|---|
| A1 | REMOTE + ASYNC | true | true | Streams at boot |
| A2 | REMOTE + ASYNC | false | true | Inactive, UI-togglable |
| A4 | REMOTE + ASYNC | any | false | Hard locked — commands rejected |
| B | none | false | true | Auto-attached at runtime, UI-togglable |
Logback Setup (scenarios A1/A2)
<appender name="REMOTE" class="com.nucleus.monitoring.client.KafkaLogAppender" />
<appender name="ASYNC_REMOTE" class="ch.qos.logback.classic.AsyncAppender">
<appender-ref ref="REMOTE"/>
<queueSize>4096</queueSize>
<discardingThreshold>0</discardingThreshold>
<neverBlock>true</neverBlock>
</appender>
<root level="INFO">
<appender-ref ref="ASYNC_JSON"/>
<appender-ref ref="ASYNC_REMOTE"/>
</root>
Instance-scoped "Keep across restarts"
The service-monitor UI has a Keep across restarts toggle on each control track (remote-logging, thread-stream, gc-logs). When it's on and the operator presses Enable, the state persists so the specific instance boots with the feature active.
Row shape written to PROPERTIES:
| APPLICATION | PROFILE | LABEL | PARAM_KEY | VALUE |
|---|---|---|---|---|
| BOOKWISE | bookwise-transaction-service | 1 | nucleus.monitoring.instance.<id>.remote-logging.enabled | true |
Instance scoping is encoded in PARAM_KEY (prefix nucleus.monitoring.instance.<instanceId>.) rather than the LABEL column. Spring Cloud Config's JDBC backend only queries a single label per request, so a comma-separated label list (${instance-id},1) is treated as one literal and returns nothing. Keeping every row at LABEL='1' avoids that.
Read path at boot uses a nested @Value:
@Value("${nucleus.monitoring.instance.${nucleus.monitoring.instance-id:default}.remote-logging.enabled:${nucleus.monitoring.remote-logging.enabled:false}}")
private boolean enabledAtBoot;
The outer placeholder ${nucleus.monitoring.instance-id} is populated at bootstrap by InstanceIdEnvironmentPostProcessor (deterministic UUIDv3 of serviceName@host:port, overrideable via NUCLEUS_MONITORING_INSTANCE_ID). Resolution: instance row wins, falls back to the service-wide row at LABEL='1', then to the code default.
Write path: target service's EnableLogsHandler returns __persistKey/__persistValue on the ack; ServiceControlAckListener in nucleus-monitoring upserts the row through system-property-service's /upsert endpoint (allowed for ROLE_ADMIN or SCOPE_internal.access).
Data Persistence
| DB Table | Purpose | Consumer Group |
|---|---|---|
| `service_health_snapshot` | Per-instance health metrics: CPU load, heap used/max, non-heap, thread count, status, uptime, DB pool, disk; `schedule_id` tags rows inside a Session Recording window and drives the CPU + heap timeline track | monitoring-health-persistence |
| `remote_log_event` | Persisted log events (tagged with `schedule_id` when captured inside a Session Recording window) | monitoring-log-persistence |
| `gc_event` | GC pause events (tagged with `schedule_id`) | monitoring-gc-persistence |
| `thread_dump_snapshot` | Thread dumps (JSON), tagged with `schedule_id` | monitoring-thread-persistence |
After log-retention.days, ambient data is purged from DB. Session Recording windows are archived to MinIO as hierarchical ZIP bundles (see Session Recordings below).
Architecture — pure microservice boundary
The monitoring service touches only its own tables via JPA. Anything owned by another service is reached over authenticated REST through a gateway. Concretely:
flowchart LR
subgraph Consumer["Consumer service (needs foreign data)"]
direction TB
Handler["Business handler"]
Gateway["XxxGateway"]
Token["ServiceTokenManager
cached client credentials"]
Client["XxxControllerApi
generated WebClient"]
Handler --> Gateway
Gateway -->|getToken| Token
Gateway --> Client
end
subgraph Target["Target service (owner of the data)"]
direction TB
API["REST endpoint"]
Auth["JWT auth filter
ROLE_INTERNAL / ROLE_ADMIN"]
Repo["JPA repository"]
DB[("Own DB")]
API --> Auth --> Repo --> DB
end
Forbidden["Forbidden path
no JdbcTemplate
no cross-service SQL
no foreign @Query"]
Client -->|HTTPS + Bearer| API
Handler -.-x Forbidden
Forbidden -.-x DB
style Gateway fill:#ede9fe,stroke:#7c3aed
style Client fill:#dbeafe,stroke:#3b82f6
style DB fill:#dcfce7,stroke:#16a34a
style Forbidden fill:#fee2e2,stroke:#dc2626,color:#991b1b
- `SystemPropertyGateway` writes per-service / per-instance config (remote-logging flags, levels, gc-logs enable, thread-stream interval) to `PROPERTIES` via `POST /system-property/upsert` — never directly. Standard pattern: inject `ServiceTokenManager`, stamp each call with a service token, delegate to the generated `SystemPropertyControllerApi` (sketched below).
- The tenant name is injected from the `nucleus.application` property (default `NUCLEUS`); each ecosystem overrides it. The nucleus source contains zero references to any product's name in architectural code.
- Kafka topic defaults are tenant-neutral (`nucleus.service.health`, `nucleus.remote.logs`, …); each ecosystem overrides via `PROPERTIES` to its own prefix so multiple Nucleus deployments can share one Kafka broker without colliding.
- No `JdbcTemplate`, no native SQL against foreign tables, no parallel config tables inside nucleus.
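A minimal sketch of that gateway pattern (the payload shape and the generated client's method names are assumptions, not the actual generated API):

```java
@Component
@RequiredArgsConstructor
public class SystemPropertyGateway {

    private final ServiceTokenManager tokenManager;   // cached client-credentials token
    private final SystemPropertyControllerApi api;    // generated WebClient client

    public void upsert(String application, String profile, String key, String value) {
        api.getApiClient().setBearerToken(tokenManager.getToken()); // stamp the service token
        api.upsert(new SystemPropertyPayload()                      // payload shape is illustrative
                .application(application)
                .profile(profile)
                .paramKey(key)
                .value(value));
    }
}
```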
Session Recordings
A Session Recording is a time-bounded capture of logs + GC events + thread dumps + CPU/heap samples for a specific service instance, replayable as a synchronized timeline. Recordings can be scheduled ahead of time or started on the spot from the Service Monitor (Record Now), and archived to MinIO for long-term retention. The shared timeline has five rows (logs · GC · threads · CPU · heap) driven by a single cursor, plus a dedicated CPU & Heap tab with an SVG chart. Per-instance config for each recording (remote-logging enabled, levels, gc-logs, thread-stream interval) is persisted through the SystemPropertyGateway — nucleus never writes to another service's tables.
flowchart LR
subgraph Source["Monitored service (any nucleus-based service)"]
direction TB
App["Application code"]
MC["nucleus-monitoring-client
• CpuSampler
• KafkaLogAppender
• GcEventStreamer
• ThreadStreamScheduler"]
App -.->|logs / JMX / JVM events| MC
end
subgraph Kafka["Kafka topics
nucleus.* defaults · per-deployment override"]
direction TB
TH["service.health"]
TL["remote.logs"]
TG["service.gc-events"]
TT["service.thread-stream"]
end
subgraph Mon["nucleus-monitoring (runtime service)"]
direction TB
Lst["Persistence listeners"]
Rsv["ActiveScheduleResolver
tags schedule_id
if inside a window"]
DB[("Own DB:
remote_log_event
gc_event
thread_dump_snapshot
service_health_snapshot")]
Lst --> Rsv
Rsv --> DB
end
subgraph UI["Session Recordings UI"]
direction TB
TL2["Shared timeline
5 density tracks"]
Tabs["Logs · GC · Threads · CPU+Heap"]
end
Bundle[("MinIO
recordings/host/.../file.zip
+ file.meta.json")]
MC -->|publish| Kafka
Kafka -->|consume| Lst
DB -->|"REST /monitor/schedules/*
density + paginated"| UI
DB -.->|archive: detach schedule_id| Bundle
Bundle -.->|restore: reinsert rows| DB
Each recording flows through three states:
- Live — rows live in `remote_log_event` / `gc_event` / `thread_dump_snapshot`, tagged with the schedule's ID; replayed from the DB.
- Archived — moved to MinIO as a ZIP bundle + `.meta.json` sidecar. DB rows are detached (`schedule_id` set to `NULL`) so they remain visible to the regular Log Explorer query for normal retention; MinIO is the durable copy of the recording.
- Permanently deleted — bundle removed from MinIO. Cannot be undone.
Capture features
| Feature | Description |
|---|---|
| Scope | Logs, GC events, thread dumps — any combination, toggleable per schedule |
| Time windows | Start/end time with timezone-aware scheduling; overnight windows supported (e.g. 23:00–05:00) |
| Recurrence | ONCE / DAILY / WEEKDAYS / WEEKENDS |
| Level filtering | Restrict to TRACE / DEBUG / INFO / WARN / ERROR (comma-separated) |
| Thread-dump interval | Configurable 1s / 2s / 5s / 10s for streaming thread snapshots throughout the window |
| Notes / tags | Free-form text on the schedule (e.g. "investigating BOOK-289 slow login"); surfaced on the recording header, Log Explorer badge, and archive audit description |
| Overlap guard | Creating or editing a schedule returns HTTP 409 if its window overlaps an existing schedule for the same service + instance |
| Window cap | 12 h server-side safety ceiling (capScheduleWindow); UI further caps open-ended Record Now at 4 h |
| Auto-deactivation | One-time schedules deactivate on window close; unified activation path always writes ACTIVATED + idempotent ENABLE_LOGS |
Record Now
A one-click button on each Service Monitor detail panel starts a recording against the current service instance immediately — no schedule dialog. Presets: 5m / 15m / 1h / 2h / Until I stop. All presets prompt for an optional notes tag before submitting. An explicit Stop button and red pulsing outline persist across page refreshes (tracked in localStorage).
In-progress indicators
- Global header — red pulsing dot with count badge, visible to `ROLE_ADMIN`. Polls `/monitor/schedules/active` every 20 s. Click routes to the Session Recordings page.
- Per-service dot — red pulsing dot on any Service Monitor card whose service has an active recording.
MinIO layout
recordings/{host}/{service}/{instance}/{yyyy}/{MM}/{dd}/{scheduleId}-{executionId}-{HHmmss}.zip
recordings/{host}/{service}/{instance}/{yyyy}/{MM}/{dd}/{scheduleId}-{executionId}-{HHmmss}.meta.json
ZIP contains manifest.json + logs.json + gc-events.json + thread-dumps.json. The sidecar .meta.json carries counts, capture flags, notes, and the bundleKey — used by the browser to render a preview panel without opening the ZIP.
UI — Administration → Logs → Session Recordings
Layout & navigation
- Split pane with a full-width breadcrumb bar above it; left rail is the file-manager browser, right rail is the recording detail.
- LIVE ↔ ARCHIVE toggle chips — switch between DB-backed recordings and MinIO archive without leaving the page. Both views share the same browser, so users only learn one navigation pattern.
- Breadcrumb + file-manager browser (replaces the old 6-level nested tree). One level at a time: Recordings ▸ host ▸ service ▸ instance ▸ yyyy ▸ MM ▸ dd ▸ recording. Every crumb is clickable.
- Hostname labels — path segments stay as IPs for URL stability, but labels render the resolved hostname (damian-HP-EliteDesk-800-G3-DM-35W) instead of 127.0.1.1.
- Current-level filter input top-right — Finder-style scoping (filters only what's at the current hop) and auto-clears on every navigation.
- Bookmarkable URL state — navigation round-trips through `?view=&p=&sel=` using `replaceUrl: true` so crumb clicks don't flood the history stack. Copy-paste the address bar to share a direct link to any recording.
- Maximize mode — `open_in_full` icon expands the detail pane to full screen for deep investigation without the browser.
Recording header
- Schedule label (`service [instance] @ hostname`), window range, counts (116 logs · 0 gc · 24 threads).
- Notes badge — pill with a `label` icon showing the `notes` tag when present.
Shared timeline (above Logs / GC / Threads tabs)
- Log density histogram — pre-bucketed counts from `/logs/density`, coloured by severity (DEBUG / INFO / WARN / ERROR). Click a bucket in the Classic Log Explorer view to scroll the Logs tab to that timestamp.
- GC marks and thread-capture marks overlaid on the same timeline — one glance tells you when GC pauses happened relative to the log burst you're investigating.
- Tracked-thread lane — the headline feature for deadlock + contention investigation. Pick a thread from the Track thread picker (or click a row in the Threads tab) and a dedicated horizontal lane appears showing that thread's state transitions across the entire window. Segments are coloured by state — green RUNNABLE, red BLOCKED, amber WAITING, blue TIMED_WAITING — so you can see at a glance when a worker was blocked on a lock, how long it waited on a condition, or when a scheduled task fired. The picked thread is also pinned to the top of the snapshot table in the Threads tab.
- Draggable cursor + play/pause drives every tab. Logs and GC auto-scroll to follow the cursor; the Threads tab snaps to the snapshot closest to the cursor.
- Shift-drag brush-select — highlights a sub-range in light blue with a chip showing the range. Filters Logs + GC tabs to that window without changing the timeline's from/to. Esc or right-click clears the selection.
- Zoom presets — 1m / 5m / 15m / 1h snap to a window ending at the recording's end. Full restores the original window (we capture `originalFromMs` / `originalToMs` so you can always zoom back out after a preset).
Tabs
- Logs — rows within cursor ±2.5 s highlight with `.le-row-active`; auto-scroll centres the active row via `scrollIntoView({ block: 'center' })`.
- GC Events — same cursor highlight + auto-scroll.
- Threads — service-monitor-parity table with Name / State / CPU / Blocked / Lock / Top-frame columns. Rows expand for the full stack trace. `TIMED_WAITING` renders in blue to visually distinguish it from `WAITING` (amber).
Recording actions
- Download bundle (live) — streams a fresh ZIP from `/monitor/schedules/{id}/bundle`.
- Archive (live → MinIO) — writes the bundle, detaches DB rows. SweetAlert2 confirmation with the shared `#4169E1` Yes/No styling.
- Download ZIP (archive) — the pre-built MinIO bundle.
- Restore (archive → live) — reinserts rows + removes the MinIO copy.
- Delete permanently (archive) — hard-delete from MinIO, no recovery.
- All confirmations use `BaseComponent.confirm()`; success paths fire `BaseComponent.showSuccess()` toasts; the browser pane re-slices / re-fetches after each mutation so the left menu stays in sync.
Log Explorer (Administration → Logs → Log Explorer)
- Classic viewmode — infinite-scroll Logs (200-px threshold, 10,000-row DOM cap with a bottom sentinel showing Loading more… / End of results · N rows / Stopped at 10,000 rows), mat-paginator on GC + Threads.
- Timeline viewmode mirrors the Session Recordings experience for the non-recording-scoped log query — 5 density tracks (logs / GC / threads / CPU / heap), shared cursor, playback, shift-drag brush-as-filter, tracked-thread lane, notes badge when the active row's schedule has a tag.
- Reactive filters — query fires on selection change, no explicit Query button needed. Zoom presets use the captured original range so Full always restores the user's original window.
Service Monitor (where Record Now lives)
- Per-service red pulsing dot on every card whose service has an active recording. Driven by `recordingServices: Set<string>` populated from a 20 s poll of `/monitor/schedules/active`.
- Record Now split-button with duration-preset menu (5m / 15m / 1h / 2h / Until I stop). Always prompts for an optional notes tag via a SweetAlert2 text input.
- Stop button + red pulsing outline on the Record Now button while a quick recording is active.
- localStorage-backed state — `quickRecordingSchedule` survives page refresh so the indicators reappear correctly.
- Schedule dialog — notes field, thread-stream interval chips (1 s / 2 s / 5 s / 10 s), level-filter chips, recurrence picker, capture toggles for logs / GC / threads.
Global + cross-page
- Global header indicator — red pulsing dot + count badge, visible only to `ROLE_ADMIN`, polls every 20 s. Clicking routes to the Session Recordings page.
- Audit Report deep-link — every Archive Session Recording / Restore / Delete row carries the MinIO `bundleKey` as its `entityKey`. The Audit Report's star column renders an `open_in_new` icon that jumps back to the bundle in the archive browser (`restoreBrowserFromUrl` auto-selects the leaf on arrival).
REST API
| Method | Endpoint | Description |
|---|---|---|
| GET | /monitor/schedules | List all schedules |
| GET | /monitor/schedules/active | Currently-active schedules across all services (powers the indicators) |
| POST | /monitor/schedules | Create schedule — 409 on overlap |
| PUT | /monitor/schedules/{id} | Update schedule — 409 on overlap |
| DELETE | /monitor/schedules/{id} | Stop & delete schedule |
| GET | /monitor/schedules/{id}/logs | Paginated logs for a recording |
| GET | /monitor/schedules/{id}/gc | Paginated GC events |
| GET | /monitor/schedules/{id}/threads | Paginated thread dumps |
| GET | /monitor/schedules/{id}/bundle | Stream the full recording as a ZIP |
| DELETE | /monitor/schedules/{id}/executions/{execId} | Archive — write bundle to MinIO, detach DB rows |
| GET | /monitor/recordings/browse?prefix= | Lazy one-level browse of archived bundles |
| GET | /monitor/recordings/archived/download?key= | Download archived ZIP |
| POST | /monitor/recordings/archived/restore?key= | Restore bundle back into DB |
| DELETE | /monitor/recordings/archived?key= | Permanently delete from MinIO |
DB tables: remote_log_schedule (service_name, instance_id, start/end time, timezone, recurrence, levels, threads_interval_ms, notes, persistent flag, active flag) and remote_log_schedule_execution (one row per ACTIVATED / DEACTIVATED / SKIPPED event). All endpoints require ROLE_ADMIN and are audit-logged.
nucleus-ui-message-broker
Real-time
Server-Sent Events (SSE) with role-based delivery for real-time UI updates.
- Single SSE endpoint per user -- multiplexes all message types
- Role-based targeting: send to ROLE_ADMIN, specific users, or broadcast
- Message expiration -- stale messages auto-expire
- Used for: service health, job progress, notifications, live dashboards
Control Events — UserControlEvent + SessionControlEvent
Security
A separate high-priority lane for security/control imperatives the UI must act on immediately. Rides alongside normal notifications but never queues behind them — the broker uses a per-subscriber PriorityBlockingQueue so a SESSION_REVOKED arriving 10ms after a routine EXPENSE_APPROVED still flushes first.
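A minimal, self-contained sketch of that ordering guarantee (the event type and priority weights are assumptions, not the broker's internals):

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class ControlLaneSketch {

    // Lower priority value flushes first; seq keeps FIFO order within a priority band.
    record OutboundEvent(int priority, long seq, String kind) {}

    public static void main(String[] args) throws InterruptedException {
        var queue = new PriorityBlockingQueue<OutboundEvent>(64,
                Comparator.comparingInt(OutboundEvent::priority)
                        .thenComparingLong(OutboundEvent::seq));

        queue.put(new OutboundEvent(10, 1, "EXPENSE_APPROVED")); // routine notification, queued first
        queue.put(new OutboundEvent(0, 2, "SESSION_REVOKED"));   // control event arriving 10 ms later

        System.out.println(queue.take().kind());                 // prints SESSION_REVOKED — control flushes first
    }
}
```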
Two event types, one envelope:
- `UserControlEvent` -- targets one specific user (any number of their open sessions). The broker filters by `username == jwt.username`.
- `SessionControlEvent` -- targets every connected session. The broker fans out to all SSE subscribers.
Currently planned kinds:
- `SESSION_REVOKED` — user disabled, UI logs out immediately
- `FORCE_PASSWORD_RESET` — admin forced reset, UI redirects to the reset page
- `PERMISSION_CHANGED` — role added/removed, UI silently refreshes authorities (no logout)
- `LOCATION_PREFERENCE_CHANGED` — admin overrode the default location, UI updates the location signal
- `MAINTENANCE_NOW` — emergency maintenance, blocking modal + auto-logout (SessionControlEvent)
- `BANNER_UPDATE`, `FORCE_RELOAD` — system-wide notices (SessionControlEvent)
Topic shape: two Kafka topics per ecosystem — ${product}.ui.control (single-partition, low volume, ordering matters) and ${product}.ui.notifications (multi-partition, normal traffic). Both cleanup.policy=delete with 24h retention.
Defense-in-depth stack: control events are layer 2. The hard floor is layer 1 (short access-token TTL, refusing refresh for disabled users); layer 3 is per-mutation requireUserEnabled(user) checks in service code. Each layer covers what the others can't.
Modules: nucleus-user-control-events (POJOs), nucleus-user-control-publisher (publish API), nucleus-ui-message-broker (the SSE service, one deployment per ecosystem).
Token Revocation — JWT denylist + disabled-user filter cache
Security
The UI control channel above handles active sessions with millisecond responsiveness. But a stolen JWT used outside the browser (curl, an attacker, a stale tab without SSE) doesn't go through the UI — it goes straight to a REST endpoint. That's where this layer kicks in.
JWS as a standard is stateless: signature + exp claim is the entire validation story. To revoke before exp, every service holds a local denylist populated via Kafka events on a separate, dedicated topic (${product}.token.revocation) — distinct from the UI control topic.
Two caches per service:
- `JtiDenylistCache` — `Map<jti, expiresAt>`; populated by `TOKEN_REVOKED` events; self-prunes when `expiresAt < now`.
- `DisabledUserCache` — `Set<username>`; populated by `USER_DISABLED` / `USER_ENABLED`; reconciled periodically against user-service for drift.
Both per-node, in-heap (ConcurrentHashMap). No shared infrastructure — each service holds its own copy.
TokenRevocationFilter runs after JWT signature/exp validation, before controller dispatch:
- Extract `jti` and `username` from the validated JWT.
- If `denylistCache.contains(jti)` → 401, "Token revoked".
- If `disabledUserCache.contains(username)` → 401, "Account disabled".
- Otherwise → continue (sketched below).
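A hedged sketch of that flow (the cache types are the ones described above; their `contains` methods and the stand-in interfaces are assumptions, not the module's actual source):

```java
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.security.oauth2.server.resource.authentication.JwtAuthenticationToken;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;

public class TokenRevocationFilter extends OncePerRequestFilter {

    interface JtiDenylistCache { boolean contains(String jti); }        // stand-in shapes for the
    interface DisabledUserCache { boolean contains(String username); }  // per-node caches above

    private final JtiDenylistCache denylistCache;
    private final DisabledUserCache disabledUserCache;

    public TokenRevocationFilter(JtiDenylistCache denylistCache, DisabledUserCache disabledUserCache) {
        this.denylistCache = denylistCache;
        this.disabledUserCache = disabledUserCache;
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain chain)
            throws ServletException, IOException {
        // Runs after signature/exp validation — the JwtAuthenticationToken is already in the context.
        if (SecurityContextHolder.getContext().getAuthentication() instanceof JwtAuthenticationToken auth) {
            var jwt = auth.getToken();
            if (denylistCache.contains(jwt.getId())) {                  // getId() surfaces the jti claim
                response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "Token revoked");
                return;
            }
            if (disabledUserCache.contains(jwt.getClaimAsString("username"))) {
                response.sendError(HttpServletResponse.SC_UNAUTHORIZED, "Account disabled");
                return;
            }
        }
        chain.doFilter(request, response);
    }
}
```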
Event kinds (token revocation topic):
- `TOKEN_REVOKED` — logout endpoint, after refresh-token invalidation
- `USER_DISABLED` — user-service after `enabled=false` commits
- `USER_ENABLED` — user-service after `enabled=true` commits
- `USER_TOKENS_REVOKED` — forced password reset, role-revocation flow; records a "minimum issued-at" timestamp; lets the user log back in immediately with a fresh token
Why a separate topic from the UI control channel? Different consumers, different blast radii, different failure-mode tolerances. The broker is one consumer; the auth filter runs in every service. A misbehaving broker shouldn't be able to break the auth path. Two topics, cheap to have.
Defense-in-depth: this is layer 1b. Layer 1a is JWT signature/exp validation (nucleus-authentication). Layer 2 is the SSE control channel above. Layer 3 is per-mutation requireUserEnabled(user) guards in service code. Each layer covers what the others can't.
Module: nucleus-token-revocation — auto-registers the filter when on the classpath. Same pattern as nucleus-monitoring-client.
nucleus-mail
NEW Communication
Kafka-driven email pipeline with three stages: render, send, track. 2026-04 consolidation: nucleus-mail-renderer and nucleus-mail-sender were merged into nucleus-mail; the pipeline now ships as one deployable with internal stages. Standalone repos are archived.
- nucleus-mail-common -- shared DTOs for publishing mail requests
- nucleus-mail -- unified library: rendering + sending + tracking
- nucleus-mail-renderer -- Thymeleaf template engine
- nucleus-mail-sender -- SMTP delivery with retry
- Full delivery tracking: queued, rendered, sent, delivered, failed
- Resend capability via REST endpoint
nucleus-storage
Storage
MinIO/S3-compatible object storage abstraction.
- Upload, download, delete with standardized paths
- Path pattern: `{clientId}/{entityId}/filename.ext`
- Pluggable backend -- swap MinIO for AWS S3
nucleus-execution-events
Processing
Async job execution framework for long-running processes.
- Job / Step / Activity lifecycle events via Kafka
- Real-time progress updates pushed to UI via SSE
- Persisted in database for history and reporting
nucleus-config
Configuration
Spring Cloud Config Server backed by MySQL database.
- All services pull configuration at startup
- Change config without redeployment
- Environment-specific profiles
- Encrypted properties for secrets
nucleus-user
Identity
User management, role assignment, and access request workflows.
nucleus-address
NEW Integration
Google Maps Places API integration for address autocomplete and validation. Single module — ships the runnable service, the shared Address JPA entity, payloads, and MapStruct mappers used by every Nucleus service that stores addresses. (Previously two modules; nucleus-address-common was merged in.)
Managed AddressType values (27 enum-mirror seed rows) live as a bucket in nucleus-reference-data, not a per-entity admin endpoint.
nucleus-reference-data & nucleus-reference-data-service
NEW Framework
Generic bucketed primitive for any "managed list of codes" — address types, classification labels, status enumerations, lookup tables. One library + one entity (ReferenceDataItem) + one table (reference_data_item) replaces the per-entity admin services we used to build for every code list.
Two artifacts:
- `nucleus-reference-data` — library: entity, payloads, repository, service, and the `ReferenceDataSecurity` SPI.
- `nucleus-reference-data-service` — Pattern A standalone deployment (env-var-driven, `spring-cloud-starter-config`, no `nucleus-config` or `nucleus-authentication`).
Schema — one table, bucket discriminator (nucleus-db-schema):
| Column | Type | Notes |
|---|---|---|
| `id` | BIGINT (PK) | Auto-generated |
| `bucket` | VARCHAR | Discriminator (AddressType, FooType, …) |
| `code` | VARCHAR | Stable code (uppercase, unique within bucket) |
| `label` | VARCHAR | Human-readable |
| `sort_order` | INT | UI ordering hint |
| `active` | BOOLEAN | Soft-disable hides from selectors |
| `system_owned` | BOOLEAN | Locks seeded enum-mirror rows from delete/rename |
REST API (mounted at /reference-data):
| Method | Path | Auth (delegated) |
|---|---|---|
| GET | /{bucket} | canView |
| GET | /{bucket}/{id} | canView |
| POST | /{bucket} | canManage |
| PUT | /{bucket}/{id} | canManage |
| DELETE | /{bucket}/{id} | canDelete |
| POST | /{bucket}/filter | canView |
| GET | /{bucket}/references | authenticated |
| POST | /{bucket}/report/{pdf\|excel\|csv} | canReport |
Binary report endpoints carry @Operation + @ApiResponse declaring the matching mediaType — the typescript-angular generator emits responseType: 'blob' instead of defaulting to JSON. CSV writer is wrapped in try-with-resources.
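A hedged sketch of what that declaration looks like on a report endpoint (springdoc/Swagger annotations; the service call and class are illustrative):

```java
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.media.Content;
import io.swagger.v3.oas.annotations.media.Schema;
import io.swagger.v3.oas.annotations.responses.ApiResponse;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;

public class ReferenceDataReportEndpointSketch {

    interface ReportService { byte[] renderCsv(String bucket); } // stand-in for the real report service

    private final ReportService reportService = bucket -> new byte[0];

    @Operation(summary = "Export a bucket as CSV")
    @ApiResponse(responseCode = "200",
            content = @Content(mediaType = "text/csv",
                    schema = @Schema(type = "string", format = "binary")))
    @PostMapping(value = "/{bucket}/report/csv", produces = "text/csv")
    public ResponseEntity<byte[]> csvReport(@PathVariable String bucket) {
        // The declared mediaType is what flips the generated TypeScript client to responseType: 'blob'.
        return ResponseEntity.ok()
                .contentType(MediaType.parseMediaType("text/csv"))
                .body(reportService.renderCsv(bucket));
    }
}
```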
Authorization — ReferenceDataSecurity SPI:
public interface ReferenceDataSecurity {
    boolean canView(String bucket, Authentication auth);
    boolean canManage(String bucket, Authentication auth);
    boolean canDelete(String bucket, Authentication auth);
    boolean canReport(String bucket, Authentication auth);
}
Each tenant registers a Spring bean named referenceDataSecurity implementing this SPI. The standalone service ships ConfigurableReferenceDataSecurity which reads bucket→authority maps from PROPERTIES so different deployments apply different policies without forking the controller.
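For example, a tenant could register a simple role-based policy (the role names are assumptions; the bean name is mandated by the contract above):

```java
import org.springframework.security.core.Authentication;
import org.springframework.stereotype.Component;

@Component("referenceDataSecurity")
public class AdminManagedReferenceDataSecurity implements ReferenceDataSecurity {

    private boolean hasRole(Authentication auth, String role) {
        return auth.getAuthorities().stream().anyMatch(a -> a.getAuthority().equals(role));
    }

    @Override public boolean canView(String bucket, Authentication auth)   { return auth.isAuthenticated(); }
    @Override public boolean canManage(String bucket, Authentication auth) { return hasRole(auth, "ROLE_ADMIN"); }
    @Override public boolean canDelete(String bucket, Authentication auth) { return hasRole(auth, "ROLE_ADMIN"); }
    @Override public boolean canReport(String bucket, Authentication auth) { return hasRole(auth, "ROLE_ADMIN") || hasRole(auth, "ROLE_AUDITOR"); }
}
```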
UI — one generic admin route. The bookwise-ui (and goldfish-ui mirror) ship /administration/reference-data/:bucket served by ReferenceDataComponent. Adding a new bucket is a 2-line UI change (label + route entry) — no new component, no new service, no new OpenAPI client. The Node proxy route /api/reference-data is already wired with pathRewrite: { '^/api/reference-data': '' }.
Authoring rule. If you're about to build "manage a list of codes for X", do NOT create nucleus-X-type-service or x-types.component.ts. Pick a bucket name, add a migration row, register your auth policy, add the bucket label to the UI. Done. The original per-entity address-types component was deleted in the 2026-04-26 sweep precisely because of this rule.
Deployment ports: BookWise = 11026, GoldFish = 10026, TaskSense = 9026.
Session Tracking & MDC Diagnostic Framework
Observability
Every HTTP request carries diagnostic context from the browser through all backend services into log output. Support teams trace a user's entire session across microservices with a single ID.
How It Works
- Browser generates a `uiSessionId` on first request, persists it in `localStorage`
- Angular global interceptor attaches the `X-Ui-Session-Id` header to every outgoing HTTP request
- Backend filters populate SLF4J MDC per request — automatically included in every log line
- User shares their Session ID from the Support dialog (footer → Support → copy button)
- Support searches logs by session ID in the Log Explorer or live stream filter
MDC Filters (nucleus-core)
| Order | Filter | MDC Keys | Source |
|---|---|---|---|
| 1 | SecurityContextFilter | username | Spring Security authentication |
| 2 | ContexPathFilter | remote-ip, context-path | HttpServletRequest |
| 3 | UiSessionMdcFilter | uiSessionId | X-UI-Session-Id header or ui_session_id cookie |
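A minimal sketch of what such a filter does (not the actual nucleus-core source; the header and MDC key names come from the table above):

```java
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;

@Component
public class UiSessionMdcFilterSketch extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain chain)
            throws ServletException, IOException {
        String sessionId = request.getHeader("X-UI-Session-Id");
        try {
            if (sessionId != null) {
                MDC.put("uiSessionId", sessionId); // every log line in this request now carries the ID
            }
            chain.doFilter(request, response);
        } finally {
            MDC.remove("uiSessionId");             // never leak context across pooled threads
        }
    }
}
```

(The real filter also falls back to the `ui_session_id` cookie, per the table.)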
Support Workflow
- User reports an issue → opens Support dialog → copies Session ID
- Support opens Log Explorer → selects service → pastes Session ID in search
- All log entries from that browser session are displayed across all services
- For live debugging: enable remote logging → open Logs tab → paste ID in filter
The session ID is a random diagnostic key — not a JWT or auth token. Safe to share with support.
Field-Level Data Obfuscation
Security
Per-client, per-field encryption framework that allows users to hide sensitive data (transaction descriptions, memos) using AES/GCM. Data is encrypted at rest and decrypted only in-memory when needed for processing.
Encryption
| Property | Value |
|---|---|
| Algorithm | AES/GCM/NoPadding |
| Auth tag | 128-bit |
| IV | 96-bit (random per encryption) |
| Key storage | user_security_key table (Base64, per-user) |
| Encrypted format | ENC:GCM:[Base64(IV+ciphertext)] |
Obfuscation Lifecycle
- User clicks the lock icon on a transaction → UI calls `POST /transaction/obfuscate`
- Service hashes the description, finds ALL matching transactions (batch obfuscation)
- Creates a `UserObfuscationRule` for the audit trail (entityName + fieldName + hash)
- Fetches the client encryption key, encrypts description + memo with AES/GCM (sketched below)
- Sets display to "Hidden Transaction" / "Hidden Notes", `obfuscated=true`
- Unlock reverses: deletes the rule, decrypts, restores plaintext
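A self-contained sketch of the encryption step, matching the documented format (AES/GCM/NoPadding, 96-bit random IV, 128-bit tag, `ENC:GCM:` + Base64(IV + ciphertext)); the class and method names are illustrative, not the actual nucleus-core Obfuscator API:

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

public final class GcmObfuscatorSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    public static String encrypt(String plaintext, SecretKey key) throws Exception {
        byte[] iv = new byte[12];                                     // 96-bit IV, random per encryption
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv)); // 128-bit auth tag
        byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        byte[] ivAndCiphertext = ByteBuffer.allocate(iv.length + ciphertext.length)
                .put(iv).put(ciphertext).array();
        return "ENC:GCM:" + Base64.getEncoder().encodeToString(ivAndCiphertext);
    }
}
```

Decryption reverses the framing: strip the prefix, Base64-decode, split off the first 12 bytes as the IV, and any tampering surfaces as a GCM tag-verification failure.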
Cross-Service Integration
| Service | Role |
|---|---|
| nucleus-core | Obfuscator — AES/GCM encrypt/decrypt engine |
| user-service | Stores encryption keys + obfuscation rules (audit trail) |
| transaction-service | Performs obfuscation lifecycle, stores encrypted fields |
| categorizer-service | Decrypts in-memory for AI classification (never logged) |
Obfuscation Rules API
| Endpoint | Description |
|---|---|
| `GET /user/{clientId}/obfuscation-rules` | List all rules for a client |
| `POST /user/{clientId}/obfuscation-rules` | Create rule (entityName, fieldName, literalHash) |
| `GET /user/{clientId}/obfuscation-rules/exists` | Check if a rule exists |
| `DELETE /user/{clientId}/obfuscation-rules/{ruleId}` | Delete rule |
Security Guarantees
- Client keys never leave the server (service-to-service JWT calls)
- Decrypted values exist only in-memory, never logged (PII framework enforced)
- Each client has their own key — cross-client decryption impossible
- AES/GCM provides confidentiality + integrity (tampering detected)
- Hash-based matching enables batch operations without decrypting to search
PII-Safe Logging Framework
Security
All services use a PII-safe logging framework that automatically sanitizes sensitive data before it reaches log files, monitoring systems, or log aggregators. This prevents accidental exposure of personal data.
Components
| Component | Purpose |
|---|---|
| `@CustomLog` | Lombok annotation replacing `@Slf4j`. Uses `MaskingLoggerFactory` for automatic PII sanitization. |
| `PiiArgs.keyValue(key, obj)` | Structured argument for explicit sanitization when logging objects. Returns a logstash-compatible `StructuredArgument`. |
| `PiiSanitizer` | Recursively sanitizes POJOs, collections, maps. Handles cycles and nested objects. |
| `@Pii` | Field-level annotation declaring an inline masking strategy (highest priority). |
Configuration
Each service declares PII rules in src/main/resources/application-pii.yml:
default-strategy: NONE
fields:
  email: PARTIAL
  password: FULL
  token: FULL
  jwt: FULL
  ssn: FULL
  creditCard: HASH
class-fields:
  com.example.UserDto.phoneNumber: PARTIAL
  com.example.PaymentDto.cardNumber: HASH
Masking Strategies
| Strategy | Output | Use for |
|---|---|---|
| `NONE` | Value as-is | Non-sensitive fields |
| `PARTIAL` | `d***n@email.com` | Fields needing partial visibility (email, phone) |
| `FULL` | `********` | Secrets (passwords, tokens, SSN) |
| `HASH` | `a1b2c3d4` | Correlation without exposure (IDs, card numbers) |
Resolution Order
1. `@Pii` annotation on the field (highest priority; example below)
2. `class-fields` in `application-pii.yml` (`FQCN.fieldName`)
3. `fields` in `application-pii.yml` (field name only)
4. `default-strategy` in `application-pii.yml` (fallback)
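For illustration, a DTO using the field-level annotation — the `strategy` attribute name and `MaskingStrategy` enum are assumptions about the annotation's shape:

```java
public class UserDto {

    private String name;                      // no rule anywhere — falls back to default-strategy

    @Pii(strategy = MaskingStrategy.PARTIAL)  // wins over any application-pii.yml entry
    private String email;                     // logged as d***n@email.com

    @Pii(strategy = MaskingStrategy.FULL)
    private String password;                  // always ********
}
```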
Setup
1. Add lombok.config to each module:
lombok.log.custom.declaration = org.slf4j.Logger \
com.nucleus.core.logger.MaskingLoggerFactory.getLogger(TYPE)
2. Replace @Slf4j with @CustomLog on all classes.
3. Use PiiArgs.keyValue() when logging objects:
log.info("Processing {}", PiiArgs.keyValue("request", payload));
Dependency Map
nucleus-parent      --> nucleus-bom
nucleus-bom         --> All modules (version management)
nucleus-core        --> nucleus-connectors
                    --> nucleus-storage
                    --> nucleus-audit-core
                    --> nucleus-authentication
                    --> nucleus-execution-events
nucleus-connectors  --> nucleus-monitoring-client
                    --> nucleus-ui-message-broker
nucleus-mail-common --> nucleus-mail
                    --> nucleus-mail-renderer
                    --> nucleus-mail-sender
nucleus-audit-core  --> All application services (transitive)
Configuration Properties
| Property | Default | Module |
|---|---|---|
| `kafka.service.registry.topic.name` | bookwise.service.registry | monitoring-client |
| `kafka.remote.logging.topic.name` | bookwise.remote.logs | monitoring-client |
| `monitoring.heartbeat.interval-ms` | 30000 | monitoring-client |
| `spring.config.import` | -- | All services |
| `spring.profiles.active` | -- | All services |
Tech Stack
| Layer | Technology |
|---|---|
| Backend | Java 17, Spring Boot 3.5, Spring Security, Spring Cloud Config |
| Messaging | Pluggable via @NucleusListener: Kafka, RabbitMQ, SQS, SNS, Kinesis, Google Pub/Sub, Azure Service Bus, Azure Queue Storage, ActiveMQ, Artemis, Solace (JMS + native), IBM MQ |
| AI | Provider-agnostic AiClient: OpenAI, Anthropic (extensible) |
| Database | Spring Data JPA / Hibernate (database-agnostic: MySQL, PostgreSQL, Oracle, SQL Server) |
| Storage | MinIO (S3-compatible object storage) |
| Real-time | Server-Sent Events (SSE) with role-based delivery |
| Testing | Spock 2.4 / Groovy 4.x, JaCoCo coverage, 866+ tests |
| CI/CD | Jenkins pipelines, Artifactory, Bitbucket webhooks |
| API | OpenAPI 3.0 with generated TypeScript + Java WebClient clients |