Overview and Workflow
neutrino is implemented with three major components:
- Hook Driver (`neutrino/src/`): provides runtime support for assembly tracking and code caching; implemented in C and injected into client programs.
- Probe Engine (`neutrino/probe/`): instruments probes based on assembly probes (`.toml`); implemented in Python and exposed as a CLI tool.
- DSL Compiler (`neutrino/language/`): compiles the platform-independent Tracing DSL (`.py`) into low-level assembly probes (`.toml`); implemented in Python and exposed as a CLI tool.
These modules are integrated with fork/exec to expose a simple command-line interface similar to bpftrace and valgrind.
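To make the fork/exec integration concrete, here is a minimal Python sketch of how a CLI entry point could launch a workload with a preloaded shared library. It is an illustration only, not neutrino's actual code: the helper names `build_env` and `run_workload` and the driver path are hypothetical, and the only details taken from this document are the use of `fork`, `exec`, and `LD_PRELOAD`.

```python
import os
import sys


def build_env(driver_path, base_env):
    """Return a copy of base_env with driver_path prepended to LD_PRELOAD."""
    env = dict(base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep any library that was already preloaded; the dynamic linker
    # accepts a colon-separated list.
    env["LD_PRELOAD"] = f"{driver_path}:{existing}" if existing else driver_path
    return env


def run_workload(argv, env):
    """fork a child, exec argv in it with env, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the workload.
        try:
            os.execvpe(argv[0], argv, env)
        finally:
            # Only reached if exec failed (e.g. command not found).
            os._exit(127)
    # Parent: block until the workload exits.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: report it as a negative number.
    return -os.WTERMSIG(status)
```

Under these assumptions, a call such as `run_workload(["python", "main.py"], build_env("/path/to/driver.so", os.environ))` would run the workload with the library injected, where `/path/to/driver.so` is a placeholder rather than a real neutrino path.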
The basic workflow is broken down into the following steps:
- CLI Entry loads the probe given by the `-p/--probe` option.
- If the probe is written in the DSL (`.py`), the DSL Compiler is invoked to compile and verify it into assembly probes (`.asm`) wrapped in TOML.
- CLI Entry will `fork` a subprocess to `exec` the workload (`python main.py`) and inject the hook driver via `LD_PRELOAD`.
- Hook Driver continuously captures the GPU workload, particularly the GPU kernels launched. For each kernel:
  - Hook driver will `fork` a subprocess to `exec` the probe engine.
  - Probe engine will objdump, probe, and reassemble the kernel.
  - Hook driver will `wait` for the probe engine, then load the probed kernel (and metadata) back.
  - Hook driver will `malloc` the probe maps on device and host, then launch the kernel and synchronize.
  - After synchronization, hook driver will `memcpy` the probe maps from device to host, then `fwrite` them to the file system.
- Hook driver