Introduction

High-speed volumetric imaging is an indispensable tool for investigating dynamic biological processes1. Traditional scanning-based 3D imaging techniques such as confocal microscopy2, two-photon microscopy3 and light-sheet microscopy4 offer high spatial resolution. However, their data acquisition speeds are often constrained by the need for beam scanning. Consequently, these techniques suffer from an inherent trade-off between temporal resolution and the space-bandwidth product (SBP), measured by the ratio of the 3D field-of-view (FOV) to the spatial resolution.

Single-shot 3D widefield imaging techniques circumvent this trade-off through computational imaging. These methods first encode 3D information into 2D multiplexed measurements and then reconstruct the 3D volume computationally. Examples of such techniques include light field microscopy (LFM)5,6,7,8,9, lensless imaging10,11,12, and point-spread-function (PSF) engineering13,14. Conventional LFM works by inserting a microlens array (MLA) into the native image plane of a widefield microscope5. Each microlens captures unique spatial and angular information from a sample, allowing for subsequent computational 3D reconstruction without scanning. However, this configuration suffers from intrinsic limitations, such as inconsistent spatial resolution due to uneven spatial sampling across the MLA. Recently, Fourier LFM has emerged as a solution that alleviates the limitations of conventional LFM by positioning the MLA at the Fourier (pupil) plane of a microscope6,7. By recording the light field information in the Fourier domain, Fourier LFM ensures uniform angular sampling, which allows the technique to achieve consistently high spatial resolution across the recovered volume. Despite these advancements, the synchronous readout constraints of traditional CMOS cameras remain a fundamental bottleneck for single-shot 3D widefield techniques. Although it is possible to increase the frame rate by restricting the readout to a specific region of interest (ROI) of the CMOS sensor, this inevitably results in a reduced SBP. As a result, existing LFM techniques typically operate below 100 Hz at full-frame resolution. This constraint hinders their application to ultrafast dynamic biological processes that may exceed kilohertz (kHz) rates, such as voltage signals in mammalian brains15, blood flow dynamics16 and muscle tissue contraction17, leaving a significant technological gap yet to be bridged.

To address these technological limitations, ultrafast imaging strategies have emerged, showing promise in various applications, such as characterization of ultrafast optical phenomena18,19, fluorescence lifetime imaging20,21, non-line-of-sight imaging22, voltage imaging in mouse brains23,24,25, and neurovascular dynamics recording26. Despite their merits, these techniques often come with their own trade-offs, such as the need for high-power illumination, which can be phototoxic to live biological specimens, and the need for expensive, specialized, and complicated optical systems.

Event cameras have garnered significant attention over the past decade for their capability to provide kHz or higher frame rates while offering flexible integration into various platforms27. Unlike traditional CMOS cameras that record information from the full frame synchronously and at set time intervals, the event camera employs an asynchronous readout architecture. Each pixel on the event camera independently generates “event” readouts based on the changes in the pixel-level brightness over time. Each event recording contains information about the pixel’s position, timestamp, and polarity of the brightness change, allowing for ultra-high temporal resolution and reduced latency28. As a result, event cameras enable recording of ultrafast signal changes at speeds exceeding 10 kHz, limited only by pixel latency. In addition, event cameras are implicitly sensitive to changes in the logarithm of the photocurrent, allowing them to achieve a high dynamic range exceeding 120 dB29. Furthermore, their asynchronous operation and sensitivity to only brightness changes significantly reduce data redundancy and data load, easing the memory and transmission requirements for long-term dynamic recordings, which is a limitation of traditional high-speed cameras30. (A detailed comparison of event cameras and traditional high-speed cameras is provided in Section 12 of Supplement 1). Leveraging these unique attributes, event cameras have shown promise across diverse applications, ranging from self-driving31 and gesture recognition32 to single-molecule localization microscopy33,34.

In this work, we introduce EventLFM, a novel ultrafast, single-shot 3D imaging technique that integrates an event camera into a Fourier LFM system, as illustrated in Fig. 1a. We develop a simple event-driven LFM reconstruction algorithm to reliably reconstruct 3D dynamics from EventLFM’s spatiotemporal measurements. We experimentally demonstrate the applicability of EventLFM for 3D fluorescence imaging on various samples, achieving speeds of up to 1 kHz, effectively bridging the existing technological gaps in capturing ultrafast dynamic 3D processes.

Fig. 1: Overview of EventLFM.
figure 1

a Schematics of the setup. b Space-time event spike stream captured by the event camera, where blue and red indicate negative and positive intensity changes at individual pixels, respectively. c Frames are generated from the raw event stream using the time surface algorithm within a 1 ms accumulation time interval. d Each time-surface frame is reconstructed using a light-field refocusing algorithm with an optional deep learning enhancement step. e Depth color-coded 3D light field reconstruction of the object. f An optional convolutional neural network (CNN) refines the refocused volume, enhancing the 3D resolution and suppressing the noise artifacts. g Time color-coded 3D motion trajectory reconstructed over a 45 ms time span. h Illustrative experiment results on rapidly blinking neurons in a 75 µm thick, weakly scattering brain section. Temporal traces capturing dynamic neuronal activities are extracted from reconstructed frames with an effective frame rate of 1 kHz

To elucidate the method, Fig. 1a shows an example involving a suspension of fluorescent beads that traverse various trajectories within a 3D space. EventLFM captures a stream of events, as depicted in Fig. 1b, which arise from instantaneous changes in intensity due to the rapid displacement of these beads across the FOV. Our event-driven LFM reconstruction algorithm works by first converting the acquired events into “conventional” 2D frame-based representations. This conversion is performed through a time-surface based method that leverages both spatial and temporal correlations among events over a predefined spatiotemporal window35,36. The algorithm assigns values to each pixel based on accumulated historical data, which is shown in Fig. 1c. Subsequently, these generated time-surface maps are cropped into 5 × 5 views and undergo a 3D reconstruction via the light field refocusing algorithm37, as illustrated in Fig. 1d. This refocused volume can also be enhanced by a deep learning module to improve the 3D resolution and suppress the noise artifacts, as illustrated in Fig. 1f. A representative frame depicting the 3D reconstruction of the fluorescent beads is provided in Fig. 1e displaying a depth color-coded map. Finally, to encapsulate the entire 4D information, a spatiotemporal reconstruction is visualized. This entails performing EventLFM reconstruction from the event measurements at an equivalent 1 kHz frame rate, spanning a 45 ms time window. Figure 1g shows the recovered 3D trajectories of the fast-moving fluorescent beads with a time color-coded map encoding the temporal information.
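To make the frame-generation step concrete, a minimal sketch of a time-surface conversion is shown below; it is not the vendor's built-in implementation, and the decay constant `tau`, the 0–255 scaling, and the event-tuple layout are illustrative assumptions.

```python
import numpy as np

def time_surface(events, t_frame, shape, tau=1e-3, accum=1e-3):
    """Minimal time-surface sketch (times in seconds): each pixel stores an
    exponentially decayed value based on the timestamp of its most recent
    event within the accumulation window ending at t_frame."""
    last_t = np.zeros(shape, dtype=np.float64)   # most recent event time per pixel
    active = np.zeros(shape, dtype=bool)
    for x, y, t, p in events:                    # (column, row, timestamp, polarity)
        if t_frame - accum <= t <= t_frame:
            last_t[y, x] = max(last_t[y, x], t)
            active[y, x] = True
    frame = np.zeros(shape, dtype=np.float32)
    frame[active] = np.exp(-(t_frame - last_t[active]) / tau)
    return (255 * frame).astype(np.uint8)        # 0-255 map, as in Fig. 1c
```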

We provide a quantitative evaluation of EventLFM’s 4D imaging capabilities across a range of fast dynamic samples. This includes fast-moving fluorescent beads subjected to both controlled and random 3D motions, as well as rapidly blinking beads operating at frequencies up to 1 kHz, both with and without 3D motions. In addition, we showcase a demonstrative experiment on imaging rapidly blinking neuronal signals simulated through either uniform or targeted illumination in a 75 µm thick mouse brain section, as illustrated in Fig. 1h, and demonstrate EventLFM’s ability to significantly enhance signal contrast in scattering tissues by rejecting the temporally slowly varying background in the raw event measurements. Furthermore, we demonstrate 3D tracking of labeled neurons in multiple freely moving C. elegans. Our results collectively demonstrate the robust and ultrafast 3D imaging capabilities of EventLFM, thereby underscoring its potential for elucidating intricate 3D dynamical phenomena within biological systems.

Results

System characterization

To validate the fidelity of EventLFM’s results, we conduct a comparative study with a standard Fourier LFM equipped with an sCMOS camera. First, we calibrate the 3D PSF for both systems. Given that the event camera can only capture dynamic objects, EventLFM PSFs are obtained from an event stream generated by a bead translating continuously through the system’s depth of field (DOF) at 0.2 mm s−1, as illustrated in Fig. 2a. For the standard Fourier LFM, PSFs are obtained by scanning a 1 μm bead along the z-axis. Subsequently, we analyze the lateral and axial resolutions by computing the 3D modulation transfer function (MTF) for both systems, defined as the 3D Fourier spectrum of the calibrated 3D PSF8. Figure 2b shows a strong agreement in both the 3D PSFs and the 3D MTFs between EventLFM and the standard Fourier LFM, thereby validating EventLFM’s ability to achieve high spatial resolution at a markedly improved frame rate (1000 Hz vs. 30 Hz). We note that the PSF measurements from the event camera are noisier than those from the sCMOS camera due to more pronounced noise in the measured event stream. Additional performance metrics of the standard Fourier LFM, such as FOV, DOF, and resolution, are elaborated in Section 1 of Supplement 1.
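As a point of reference, the 3D MTF used in this comparison can be computed directly from a calibrated PSF stack; the sketch below, with an assumed (z, y, x) array layout, is a simplified illustration rather than the exact analysis pipeline.

```python
import numpy as np

def mtf3d(psf):
    """3D MTF as the normalized magnitude of the 3D Fourier spectrum
    of a calibrated PSF stack (z, y, x)."""
    otf = np.fft.fftshift(np.fft.fftn(psf))
    mtf = np.abs(otf)
    return mtf / mtf.max()

# Lateral and axial cutoff frequencies can then be estimated from where the
# central MTF profiles fall below a chosen threshold (e.g., the noise floor).
```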

Fig. 2: Characterization of the EventLFM system.
figure 2

a 3D PSF acquisition pipeline. The event camera captures the vertical motion of a single bead within the DOF. This raw event stream is subsequently converted into discrete frames using the time-surface algorithm. b Comparisons of MIPs of the 3D PSFs and 3D MTFs for both the EventLFM and a benchmark Fourier LFM using an sCMOS camera. c 3D reconstruction comparison of fluorescent beads distributed volumetrically, obtained using EventLFM and the benchmark Fourier LFM. d Axial resolution assessment based on profiles of a single bead from 3D reconstructions obtained with both the event camera (in blue) and the sCMOS camera (in green), demonstrating that both systems offer comparable axial resolution. e MIPs of the reconstructed 3D volumes from both LFM methods. For additional validation, a reference MIP is obtained through axial scanning of the same object using conventional widefield fluorescence microscopy. f Accumulation time sensitivity is analyzed by plotting the MIPs of EventLFM reconstructions at different time intervals. g Effect of accumulation time on axial elongation is assessed by plotting the axial intensity profiles of a bead as marked in (f), revealing an increase in axial elongation correlated with longer accumulation times. h Illumination power analysis is performed by visualizing the MIPs of EventLFM reconstructions for the same object but under varying LED power levels. i Correlation between the mean intensity of a reconstructed bead (as indicated in (h)) and the LED power levels

For an intuitive, side-by-side comparison, we simultaneously acquire data from a slowly moving 3D fluorescent beads phantom using both systems. Both datasets - comprising time-surface frames from EventLFM and sCMOS frames - are processed via the same light field refocusing algorithm to generate 3D reconstructions. Figure 2c shows single-frame depth color-coded 3D reconstructions from both systems. The consistency between the two methods verifies EventLFM’s fidelity in reconstructing depth information throughout the DOF. To further confirm that EventLFM provides consistent axial resolution, intensity profiles extracted from the same bead along the white dashed lines are compared in Fig. 2d. For further validation, conventional widefield fluorescence microscopy (Plan Apo, 20X, 0.75 NA, Nikon) is also employed to capture a z-stack of the same phantom (See details in Section 1 of Supplement 1). A comparison of depth color-coded maximum intensity projections (MIPs) across all three methods is shown in Fig. 2e. The results confirm EventLFM’s capability for accurate volumetric reconstruction across the entire FOV. Intriguingly, we observe that the axial elongation in EventLFM reconstructions is slightly smaller than that in the standard Fourier LFM, as evidenced in both the 3D reconstructions (Fig. 2c) and axial profiles of individual beads (Fig. 2d). We attribute this observation to the unique event-driven signal acquisition mechanism of the event camera. Specifically, an accumulation time of 1 ms necessitates sufficient power to trigger events. When the illumination power is low, only the central region of the beads has adequate intensity to generate such events, which in turn reduces the axial elongation in the reconstructions.

We also characterize how EventLFM’s performance is affected by key experimental parameters, specifically the accumulation time of the event camera and the illumination power. The raw event stream from the event camera exhibits a temporal resolution of 1 μs. When this data is transformed into frames, the user-defined accumulation time significantly influences the quality of the reconstruction. To demonstrate this, we image a fluorescent beads phantom moving at 2.5 mm s−1 along the y direction. Similar to conventional cameras, a longer accumulation time leads to increased average intensity and enlarged, blurred bead images, as shown in Fig. 2f and the axial profiles in Fig. 2g. By selecting an appropriate accumulation time based on the sample’s brightness and event dynamics, the event camera can achieve superior resolution (See more discussion in Section 1 of Supplement 1). It should be noted that the accumulation time is only adjusted in the post-processing step without impacting the data capture speed (Detailed discussion about accumulation time is provided in Section 11 of Supplement 1). Next, while the event stream inherently lacks information on absolute intensity, we observe its sensitivity to variations in object brightness levels, as shown in Fig. 2h. Intuitively, this is because a larger intensity variation produces more events in quick succession. To demonstrate this, we record the same fluorescent beads phantom moving at 1 mm s−1 along the y direction under varying illumination powers, spanning 1.8 mW mm−2 to 8.1 mW mm−2, while keeping the accumulation time constant. The subsequent EventLFM reconstructions reveal a positive correlation between reconstructed intensity and illumination power, as depicted in Fig. 2i.
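Because the accumulation time is purely a post-processing choice, the same raw event stream can be re-binned at several candidate intervals before reconstruction; the count-based sketch below (hypothetical array names, without the time-surface weighting) illustrates this re-binning.

```python
import numpy as np

def rebin_events(t, x, y, dt, shape, t0=0.0):
    """Re-bin a recorded event stream into frames of accumulation time dt (s).
    t, x, y: per-event timestamps (s) and pixel coordinates; returns an array
    of shape (n_frames, H, W) containing per-pixel event counts."""
    frame_idx = ((t - t0) / dt).astype(int)
    frames = np.zeros((frame_idx.max() + 1, *shape), dtype=np.float32)
    np.add.at(frames, (frame_idx, y, x), 1.0)
    return frames

# The same capture can be revisited at, e.g., 1 ms, 2 ms, and 5 ms windows,
# as in the accumulation-time study of Fig. 2f:
# stacks = {dt: rebin_events(t, x, y, dt, (720, 1280)) for dt in (1e-3, 2e-3, 5e-3)}
```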

Imaging of fast-moving objects

We substantiate the capability of EventLFM to reconstruct high-speed 3D motion, demonstrating its utility in capturing dynamical phenomena in biological contexts. First, we employ a motorized stage with a velocity range of 0.001 mm s−1 to 2.7 mm s−1 to execute controlled motion experiments. We image a 3D phantom comprising 2 μm fluorescent beads moving at 2.5 mm s−1. EventLFM successfully reconstructs the rapid movements across all depths at an effective frame rate of 1 kHz, as illustrated in Fig. 3a. Furthermore, to evaluate the imaging performance of EventLFM across various velocities, we extract an ROI and present consecutive frames in Fig. 3b, c. Figure 3b shows the accelerating process from 0.1 mm s−1 to 2.5 mm s−1 with an acceleration of 2.5 mm s−2. The trajectory’s slope (dashed white curve) steepens over time, reflecting the increased speed. Figure 3c provides consecutive frames within the same ROI at the peak velocity of 2.5 mm s−1. Given the object’s fixed velocity along the x-axis, the bead positions calculated from the motorized stage setting align well with the EventLFM reconstructions. In contrast, the standard Fourier LFM using the sCMOS camera operating at 30 Hz suffers from severe motion blur artifacts. We image the same object using the benchmark Fourier LFM system under static and slow-motion conditions (See details in Section 5 of Supplement 1).

Fig. 3: EventLFM imaging of fast-moving objects.
figure 3

a–d Results from objects with directional movement. a Depth color-coded MIP of a single reconstructed frame capturing a phantom comprising fluorescent beads. The beads move in a single direction, as denoted by the arrow, at a calibrated speed. b Zoom-in MIPs from 32 consecutive frames, revealing the unidirectional, continuous accelerating motion of the beads over time. The shadowed traces on the projections highlight the beads’ motion trajectory over time, with an increasing slope indicative of accelerated movement. c Zoom-in MIPs from 6 consecutive frames, revealing the unidirectional, uniform motion of the beads on a millisecond scale at 2.5 mm s−1 over time. d Lateral and axial resolutions across varying velocities are assessed through the FWHM of a single bead in the reconstruction at different frames. The 3D resolutions remain stable across speeds ranging from 0.1 mm s−1 to 2.5 mm s−1, with the blue line indicating the average resolution, and the shaded area representing the standard deviation. e, f Results from objects with random motions. e Depth color-coded MIP of a reconstructed frame showcasing fluorescent beads undergoing random movements in a liquid solution (see video of the moving beads in Visualization 1). f Zoom-in 3D volume renderings detail the random 3D trajectories of moving beads, as denoted by the dotted lines

Additionally, the system’s lateral and axial resolutions are quantitatively evaluated at various speeds by calculating the FWHM of a reconstructed single bead, as depicted in Fig. 3d. The 3D resolutions are consistent across a velocity spectrum from 0.1 mm s−1 to 2.5 mm s−1, demonstrating the EventLFM system’s capability to reliably image objects in motion at differing velocities without resolution degradation, which further corroborates the robustness of our EventLFM system. These controlled experiments confirm EventLFM’s efficacy in capturing rapid 3D dynamics at frame rates up to 1 kHz.
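For reference, the FWHM values underlying this analysis can be estimated from a bead's 1D intensity profile by locating the half-maximum crossings; the sketch below is a generic estimator with an assumed uniform pixel (or axial step) size, not the exact analysis code.

```python
import numpy as np

def fwhm(profile, step):
    """FWHM of a 1D bead intensity profile, with sub-pixel linear
    interpolation at the half-maximum crossings; step is the lateral
    pixel size or axial step (in micrometers)."""
    p = profile.astype(float) - profile.min()
    half = 0.5 * p.max()
    above = np.where(p >= half)[0]
    left, right = above[0], above[-1]
    if left > 0:                               # refine left crossing
        left = left - (p[left] - half) / (p[left] - p[left - 1])
    if right < len(p) - 1:                     # refine right crossing
        right = right + (p[right] - half) / (p[right] - p[right + 1])
    return (right - left) * step
```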

In the context of biological applications, the motion of many samples occurs over a gamut of velocities, directions, and depths. Acknowledging this complexity, we extend our EventLFM evaluations to scenarios involving uncontrolled complex 3D motion. Specifically, fluorescent beads are suspended in an alcohol-water droplet subjected to ultrasonic disintegration, inducing variable motion directions and velocities exceeding 2.5 mm s−1. After performing EventLFM reconstructions, we present a depth color-coded MIP in Fig. 3e. A selected sub-region, marked by a white dashed square, is subjected to volumetric rendering for 5 representative frames in Fig. 3f. Leveraging the millisecond-level temporal resolution, we trace intricate trajectories (depicted as dotted lines with arrows) for individual beads. Notably, complex motion patterns - including depth fluctuations - are faithfully captured. For instance, a bead represented in blue in the frame labeled #00 in Fig. 3f exhibits helical movement through the volume over several milliseconds. This affirms EventLFM’s utility in characterizing complex, high-speed 3D dynamics.

Imaging of dynamic blinking objects

In addition to capturing rapid motions, another important category of complex and dynamic biological processes entails rapidly blinking signals, such as those arising from neuronal activities. To evaluate EventLFM’s capability of tracking these types of dynamic signals, we employ a high-power LED driver (DC2200, Thorlabs) to generate adjustable pulsed illumination. In this proof-of-concept study, a 3D phantom embedded with fluorescent beads is illuminated using a variable pulse sequence, configured with a 1 ms pulse width and a variable inter-pulse interval ranging from 2 ms to 50 ms.

It should be noted that the event-based signal features from blinking objects differ from those of fast-moving objects. Thus, an additional system characterization tailored to blinking objects is carried out. In this experiment, we maintain a pulse width and an accumulation time of 1 ms, while the optical power during the LED on-state is systematically varied between 1.8 mW mm−2 and 8.1 mW mm−2. As shown in Fig. 4a, the reconstructed signal increases with the applied optical power. We further quantify this relationship by isolating a single bead and calculating its mean reconstructed intensity at various illumination powers; the resultant graph presented in Fig. 4b reveals an approximately linear relationship.

Fig. 4: EventLFM imaging of dynamic blinking objects.
figure 4

a Depth color-coded MIPs of single-frame reconstructions capturing a blinking object under variable LED illumination powers. The LED pulse width is set to 1 ms with an inter-pulse interval of 2 ms. b Quantitative analysis of the reconstructed bead intensity as a function of LED power. Intensity measurements are averaged over a region within a single bead, as indicated by an arrow in (a). c Depth color-coded MIP of a single frame extracted from the reconstructed 3D volume, showcasing blinking fluorescent beads. d Temporal trace analysis is performed by calculating the mean intensities of three distinct beads, labeled as 1, 2, and 3, from the reconstructed volume. The LED pulse widths are uniformly set at 1 ms, while the inter-pulse intervals vary randomly between 2 ms and 50 ms. e Depth color-coded MIP of a single frame from the reconstructed moving phantom embedded with blinking fluorescent beads. f A temporal sequence of depth color-coded MIPs from the white dashed rectangular region in (e) demonstrates synchronization with the LED pulse sequence shown below. The reconstructed bead positions are in agreement with the input linear motion

Next, we demonstrate EventLFM’s ability to image 3D objects blinking at disparate intervals. For post-processing, a 1 ms accumulation time (equivalent to a 1 kHz frame rate) is set, synchronized to the onset of the first pulse. By employing the light field refocusing algorithm, we successfully reconstruct the blinking beads as displayed in Fig. 4c. To further validate the system’s accuracy, three distinct beads (as marked in the MIP in Fig. 4c) are selected and their mean intensity signals are calculated, as shown in Fig. 4d. The temporal traces confirm that the reconstructed signals, despite slight fluctuations in intensities, are in agreement with the pre-configured LED pulse sequences. This result validates EventLFM’s capability in capturing high-frequency blinking signals in a 3D spatial context.
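The temporal traces in Fig. 4d are simply per-bead mean intensities taken over the sequence of reconstructed volumes; a minimal sketch is given below, with a hypothetical box-ROI convention assumed for each bead.

```python
import numpy as np

def bead_traces(volumes, rois):
    """Mean-intensity temporal traces for selected beads.
    volumes: reconstructed frames of shape (T, Z, Y, X), e.g., at 1 kHz;
    rois: list of (z0, z1, y0, y1, x0, x1) boxes around each bead."""
    traces = []
    for z0, z1, y0, y1, x0, x1 in rois:
        traces.append(volumes[:, z0:z1, y0:y1, x0:x1].mean(axis=(1, 2, 3)))
    return np.stack(traces)   # shape (n_beads, T)
```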

To provide a comprehensive assessment, we introduce concurrent linear motion to the blinking objects by synchronizing pulsed illumination with translational movement of the 3D phantom via a motorized stage. A phantom embedded with fluorescent beads is used, with parameters set as in the earlier experiments: a pulse width of 1 ms, a 2 ms inter-pulse interval, and an object velocity fixed at 2.5 mm s−1. Figure 4e shows a depth color-coded MIP from a single reconstructed volume frame. To elucidate the dynamic objects further, Fig. 4f illustrates 16 consecutive frames within the white dashed rectangular region indicated in Fig. 4e. These frames clearly show the expected linear motion, and the blinking events are reconstructed at the expected timestamps. Each reconstructed bead translates linearly along the y-axis. Each signal-bearing frame is followed by two empty frames, conforming to the set LED pulse sequence shown in the bottom panel of Fig. 4f. These results confirm EventLFM’s robust and reliable performance in capturing complex 3D dynamics.

Imaging of neuronal signals in scattering mouse brain tissue

To demonstrate EventLFM’s potential for neural imaging, we image a 75 µm thick section of GFP-labeled mouse brain tissue. Initially, the sample is uniformly illuminated using a pulsed LED source with a 1 ms pulse width. To validate the spatial reconstruction accuracy of EventLFM, we capture the fluorescence signals with traditional Fourier LFM and conventional fluorescence microscopy under constant illumination. Figure 5a shows MIPs from a single reconstructed frame of each method. By visual inspection, the reconstruction from EventLFM is consistent with Fourier LFM, effectively capturing all neurons within the FOV and the intensity variations among them. However, a notable difference arises in the signal-to-background ratio (SBR). Fourier LFM suffers from a low SBR due to tissue scattering, which results in neuronal signals being buried in strong background fluorescence. In contrast, EventLFM demonstrates a significantly improved SBR, yielding a reconstruction with markedly improved image contrast and suppressed background fluorescence. This improvement is attributed to the event-based measurement mechanism intrinsic to EventLFM, wherein a readout is generated only when intensity changes exceed a certain threshold. Consequently, temporally slowly varying background fluorescence signals, which often do not meet this criterion, are either removed or substantially reduced in the raw data.
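This background-rejection behavior can be illustrated with a toy per-pixel event model in which an event fires whenever the log-intensity departs from the level stored at the previous event by more than a contrast threshold; the threshold value and trace shapes below are illustrative assumptions, not sensor specifications.

```python
import numpy as np

def count_events(intensity_trace, theta=0.2):
    """Toy per-pixel event model: fire an event whenever |log I - log I_ref|
    exceeds the contrast threshold theta, then reset the reference level."""
    log_i = np.log(np.clip(intensity_trace, 1e-6, None))
    ref, n_events = log_i[0], 0
    for v in log_i[1:]:
        if abs(v - ref) > theta:
            n_events += 1
            ref = v
    return n_events

t = np.arange(1000) * 1e-3                          # 1 s at 1 kHz sampling
background = 1.0 + 0.02 * t                         # slow drift: few or no events
neuron = 1.0 + (np.sin(2 * np.pi * 5 * t) > 0.99)   # brief bright flashes
print(count_events(background), count_events(neuron))
```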

Fig. 5: EventLFM imaging of neuronal activities in mouse brain tissue.
figure 5

a, b Spatiotemporal MIP comparison between axial scanned widefield microscopy, Fourier LFM, and EventLFM under uniform and targeted illumination. c Simulated neuronal activities within mouse brain tissue across various frequencies and time constants using a sequence of pre-modulated DMD patterns. Black temporal traces illustrate the stimulus from DMD-modulated illumination targeting twelve distinct neurons labeled in (a, b), while colored temporal traces represent the recorded neuronal signals, derived from averaging the intensities for each neuron. d Normalized cross-correlations of the neuronal traces extracted from the EventLFM recording and their sorted ground-truth stimuli. e Averaged intensity (line) and standard error (shaded area) across the spike train for each neuron; the averaged pulse width accurately matches the intensity-normalized stimulus (dotted line)

Given that neuronal activity is often characterized by temporal intensity variations with minimal spatial movements, we implement a targeted illumination strategy with pre-designed DMD patterns to further reduce the background noise38. These patterns modulate the excitation light to selectively illuminate the neurons, thereby mitigating the undesired background fluorescence excited from other sample regions. Moreover, to simulate realistic complex neuronal activities15,25,39, we modulate a series of DMD patterns with pulse widths ranging from 1 ms to 6 ms and intervals from 100 ms to 600 ms, generating unique spike trains for each neuron. To illustrate the efficiency of targeted illumination in background suppression, Fig. 5b provides the spatiotemporal MIPs of the time-series measurements from widefield microscopy, Fourier LFM, and EventLFM with targeted illumination. By employing pre-designed DMD patterns (labeled by white dashed lines), only neurons are selectively illuminated within the brain slice, therefore providing a much cleaner background compared with the corresponding MIPs from the uniform illumination shown in Fig. 5a. When comparing the MIPs obtained from EventLFM and Fourier LFM, EventLFM again demonstrates improved signal contrast and background suppression capability by leveraging its event-driven measurements.

Additionally, to highlight EventLFM’s capability in capturing rapid neuronal activities, Fig. 5c presents temporal traces from 12 distinct neurons, which are closely aligned with the pre-modulated DMD stimulus. Moreover, Fig. 5d presents a quantitative evaluation through the normalized correlation coefficient between neuronal signals extracted from EventLFM and the corresponding sorted ground-truth stimulus over a time duration spanning 2.4 s. The diagonal elements reaching a value of ‘1’ underscore the strong correlation between temporal traces extracted from EventLFM and their respective ground truths. Non-diagonal elements reveal the presence of harmonic frequencies in the stimulus and potential signal crosstalk among closely situated neurons. These alignments and strong correlations validate EventLFM’s capability to accurately reconstruct neuronal blinking dynamics within scattering tissues. Finally, utilizing predefined pulse widths and intervals for the spike trains of 12 neurons, we determine the temporal locations and extract all spikes within a 2-s duration from the reconstructed traces. Figure 5e presents the averaged intensity and standard error of the extracted single spikes for all 12 neurons. For validation, these spikes are compared with the ground truth stimulus, where the amplitude is defined by the averaged fluorescence intensity for each neuron. The pulse width derived from the EventLFM reconstruction matches the stimulus precisely, demonstrating EventLFM’s capability to robustly and accurately capture the unique footprints of complex neuronal dynamics. Additional details including DMD pattern generation, denoising of the event stream from the brain slice measurements, 3D reconstruction results of the brain slice, and additional experimental results with pulsed illumination can be found in Section 8 and Section 9 of Supplement 1.
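The correlation matrix in Fig. 5d reduces to a zero-lag normalized (Pearson) correlation between each extracted trace and each ground-truth stimulus; a minimal sketch of that computation is shown below, with assumed (n_neurons, T) array shapes.

```python
import numpy as np

def correlation_matrix(traces, stimuli):
    """Normalized (Pearson) correlation between extracted neuronal traces
    and ground-truth stimuli; both arrays have shape (n_neurons, T)."""
    t = (traces - traces.mean(1, keepdims=True)) / traces.std(1, keepdims=True)
    s = (stimuli - stimuli.mean(1, keepdims=True)) / stimuli.std(1, keepdims=True)
    return t @ s.T / traces.shape[1]   # diagonal close to 1 for matched pairs
```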

Imaging of neuron-labeled freely moving C. elegans

To further showcase EventLFM’s ability to capture complex biological dynamics, we employ it to track GFP-labeled neurons in a sample containing multiple C. elegans. For the experiment, the C. elegans are positioned on a gel substrate and subsequently submerged in a droplet of S-Basal solution, thereby creating a 3D environment for their free movement. First, we identify four distinct GFP-expressing neurons using conventional fluorescence microscopy - two located in the tail region and another two in the mid-body section, as visualized in Fig. 6a. Despite the relative sparsity of neurons, multiple C. elegans specimens are placed within the FOV. To accumulate enough event data for weaker neuronal signals, we set the accumulation time at 2 ms, yielding an effective frame rate of 500 Hz, which is sufficient for real-time 3D tracking of the organism. Using our EventLFM reconstruction algorithm, we generate a depth color-coded MIP of the reconstructed volume frame at time 0 ms in Fig. 6b, which clearly shows the spatial distribution of neurons across different depths for four distinct C. elegans. To further extract the neuronal dynamics, we focus on a specific region marked by a white dashed rectangle in Fig. 6b. Two temporally separated 3D reconstructions from this region are presented at timestamps 42 ms and 92 ms in Fig. 6c, complete with tracked trajectories marked in dashed lines. To further examine the neuronal movements, we present a time-series montage of the aforementioned area in Fig. 6d (Additional results and comparisons with standard Fourier LFM are shown in Section 7 of Supplement 1). Notably, the neurons displayed in blue exhibit rapid and ascending motion across multiple axial planes over the time course. These results showcase EventLFM’s capability to accurately capture biological dynamics in a 3D space at ultra-high frame rates.

Fig. 6: EventLFM imaging of freely moving C. elegans.
figure 6

a Representative image of a C. elegans specimen captured via conventional widefield fluorescence microscopy, with the organism’s contour delineated by the dashed line. b MIP of a depth color-coded reconstructed volume. c Time-resolved 3D reconstructions extracted from the white dashed rectangle region in (b) at timestamps 42 ms and 92 ms. The tracked trajectories over a 106 ms sequence are shown in the dashed lines. d Sequential MIPs extracted along the white dashed rectangle in (b), spanning a duration of 106 ms, reveal the 3D motions of neurons within the C. elegans organism

Reconstruction results with deep learning

While the light-field refocusing algorithm is efficient, straightforward, and robust to various samples, it is susceptible to ghost artifacts and axial elongations in the 3D reconstructions. To address these issues, we implement a convolutional neural network (CNN), modified from our previously developed CM2Net9, tailored for high-resolution volumetric reconstruction in imaging systems with multi-view geometry. This network is trained with an experimentally collected dataset containing 3D moving particle phantoms. Event streams from these phantoms are integrated to generate time-surface frames at 1 ms intervals, which are then cropped into 5 × 5 view stacks and refocused to produce refocused volumes (RFVs) as inputs for the network. Corresponding ground-truth volumes are established through axial scans from conventional wide-field microscopy (20X, 0.75 NA).

Initially, the trained network is tested on a fast-moving 3D phantom containing 2 µm fluorescent particles. Figure 7a illustrates a depth color-coded comparison of the CNN reconstruction, RFV, and ground-truth volume, revealing that both the CNN and RFV accurately recover the 3D information, with the CNN offering enhanced 3D resolution and suppressing the refocusing artifacts compared to the RFV. The improvement in 3D resolution is quantified by comparing the x and z profiles of a single bead, shown in Fig. 7b, where the CNN’s profile aligns closely with the widefield system, demonstrating a ~2× enhancement over the RFV. This resolution enhancement is attributed to the network’s ability to leverage subpixel parallax shifts across various MLA views, allowing for additional spatial information not accessible through conventional single-lens views. To further demonstrate the network’s ability to accurately capture continuous motion, Fig. 7c presents a time color-coded 3D reconstruction and ground truth volume spanning 100 ms. The motion is further visualized by extracting three representative depths at the top, center, and bottom planes within the 3D volumes. The trajectories of particle motions (colored straight lines), sampled at a 1 kHz rate, verify the network’s ability to robustly track the fast-moving objects with high 3D resolution.

Fig. 7: Deep learning reconstruction results for EventLFM.
figure 7

a Comparison between the CNN reconstruction, RFV, and the ground-truth obtained via wide field microscopy with axial scanning (20×, 0.75 NA). b Axial and lateral profiles of a single bead, as indicated by the white square in (a). The CNN reconstruction (red line) improves both axial and lateral resolution compared to the RFV (green line) and matches the particle size as validated by the wide field system. c Visualization of the scanned wide field ground-truth volume and 3D reconstructions across 100 sequential frames, with time coded by color. Three representative depths at the top, center, and bottom within the 3D volume are extracted for close validation. The trajectories of rapidly moving particles, sampled at a 1 kHz rate, are delineated by straight lines. d Depth color-coded 3D CNN reconstruction of a frame marked as Frame #1. e Temporal trace analysis for three particles, labeled 1, 2, and 3, within the 3D reconstruction. The LED pulse widths are uniformly set at 1 ms with the inter-pulse intervals randomly varying between 2 ms and 50 ms

Finally, we conduct an experiment to evaluate the network’s performance on dynamic blinking objects. Figure 7d presents a depth-color-coded CNN reconstruction of a single frame, demonstrating the high 3D resolution capability in such conditions. Despite the differing event characteristics between blinking and moving objects, employing the RFV as an initial estimation allows the network to effectively generalize across both types of samples. Additionally, Fig. 7e illustrates the temporal traces of three particles within the reconstruction, which precisely align with the LED illumination pulse sequence. This demonstrates the network’s ability to accurately capture high-frequency blinking signals in a 3D context at enhanced resolutions using the CNN.

These preliminary reconstruction results on both fast-moving and dynamic-blinking phantoms demonstrate that EventLFM augmented by deep learning enables ultrafast 3D imaging with high 3D resolution and minimal reconstruction artifacts.

Discussion

We present the first, to the best of our knowledge, ultrafast Fourier LFM system, EventLFM, that leverages an event camera and a tailored reconstruction algorithm to facilitate volumetric imaging at kHz speeds. By comparing the PSF, MTF, and 3D reconstructions, we have established that EventLFM achieves a lateral resolution comparable to that of a traditional Fourier LFM system. Notably, EventLFM provides marginally superior axial resolution and substantially improved temporal resolution. Our experimental results further underscore EventLFM’s versatility and capability. We demonstrate its effectiveness in reconstructing complex dynamics of rapidly moving 3D objects at 1 kHz temporal resolution. Moreover, through controlled illumination experiments, we showcase imaging of high-frequency 3D blinking objects with pulse widths as short as 1 ms. Additionally, we demonstrate EventLFM’s ability to capture rapid dynamic signals within scattering tissues by imaging realistic neuronal activities in a mouse brain section, simulated by a series of DMD patterns that induce unique spatiotemporal footprints at kHz rates. Moreover, we present imaging and tracking of GFP-expressing neurons in freely moving C. elegans within a 3D space, achieving a frame rate of 500 Hz. Lastly, we show the integration of a deep learning reconstruction network with EventLFM to improve the imaging quality and enhance the 3D resolution.

By leveraging its unique capability to capture brightness changes, augmented by targeted illumination, EventLFM significantly mitigates the low-SBR challenge typically encountered in scattering environments40 - a major limitation of traditional widefield microscopy techniques - as demonstrated by our experiment on mouse brain tissue. However, the effectiveness of targeted illumination is currently constrained to shallow depths because it relies on a high NA objective to image the DMD patterns onto the sample plane. Future advancements to EventLFM can involve 3D targeted illumination techniques41, potentially extending the SBR enhancement across the entire volume with improved 3D optical sectioning capability. In addition, our work opens tremendous opportunities for future research in event-driven imaging within scattering media42 and the development of advanced computational algorithms that more effectively leverage event-driven measurements for extracting dynamic signals from deep within scattering tissues.

The high sensitivity of event cameras often results in increased noise levels, presenting challenges for their application in microscopy settings. Our light field refocusing algorithm for EventLFM leverages the uncorrelated noise characteristics across different views under the MLA to effectively reduce noise while amplifying in-focus signals through the shift-and-sum operation, resulting in a 5× SNR improvement (An additional noise and SNR analysis is provided in Section 4 of Supplement 1). For future developments, there is significant potential for employing advanced denoising algorithms to further suppress stochastic event sensor noise43,44.

Our pilot study of deep learning for EventLFM reconstructions has yielded promising results, notably enhancing 3D spatial resolution and mitigating the ghost artifacts inherent to the refocusing algorithm. Nonetheless, the current training dataset collection process is both time-consuming and limited in data diversity due to the difficulty of acquiring large-scale, ground-truth, high-speed (e.g., kHz), high-resolution volumes of diverse biological samples. A promising future direction is to develop a physics-based simulator that can efficiently generate a broad range of training data. This will significantly enhance the network’s ability to generalize to more complex biological contexts. Furthermore, the simulator will pave the way for investigating more advanced deep learning techniques, such as employing a demultiplexing network to expand the imaging FOV45, thus compensating for the event camera’s limited pixel array. Another possible direction involves developing an adaptive encoder that enhances the extraction of physically meaningful information from the sparse event data43,44, fully exploiting the sparsity of the event stream46 to reconstruct more complex 3D processes over large volumes.

In conclusion, given its simplicity, ultrafast 3D imaging capability, and robustness in scattering environments, EventLFM has the potential to be a valuable tool in various biomedical applications for visualizing complex, dynamic 3D biological phenomena.

Materials and methods

Experimental setup

EventLFM augments a conventional Fourier LFM setup with an event camera (EVK4, Prophesee, IMX636 sensor, 1280 × 720 pixels, 4.86 µm pixel size), as shown in Fig. 1a. A blue LED (SOLIS-470C, Thorlabs) serves as the excitation illumination source for the fluorescent samples. This excitation light is focused onto the back pupil plane of the objective lens (Plan Apo, 20X, 0.75 NA, Nikon) to ensure uniform illumination across the target volume. In addition, we add another branch in the illumination path that includes a DMD (DLI4130 0.7 XGA VIS High-Speed Kit, Digital Light Innovations) for manipulating the spatiotemporal distribution of the excitation light within the FOV. In the detection path, fluorescence emissions from the sample are collected by the objective lens and subsequently relayed to an intermediate image plane by a tube lens (TL, f = 200 mm, ITL200, Thorlabs). This intermediate image is then transformed by a Fourier lens (FL, f = 80 mm, AC508-080-A, Thorlabs). An MLA (f = 16.8 mm, S600-f28, RPC Photonics) is placed at the back focal plane of the FL to achieve uniform angular sampling, thereby generating a 5 × 5 subimage array. An ancillary 4f relay system (M = 1.25, not shown) is placed after the MLA to ensure optimal distribution of these subimages across the event sensor. In addition, a 50/50 beamsplitter is integrated within the 4f relay to enable simultaneous capture of the dynamic 3D volumes by both the event camera and an sCMOS camera (CS2100M-USB, Thorlabs), thereby providing a direct comparative benchmark between EventLFM and traditional Fourier LFM modalities. Further details can be found in Section 1 of Supplement 1.

Reconstruction algorithm

The event camera records the polarity of changes in pixel intensity as an event stream with a temporal granularity as fine as 1 μs. Specifically, an event is generated when the brightness change at a pixel surpasses a pre-defined threshold. For each event, the sensor outputs the spatial coordinates x and y, the precise timestamp t, and the polarity p (either positive or negative depending on the direction of the intensity change). This asynchronous event stream is then sorted based on the polarity and integrated within a user-defined accumulation time to construct temporally continuous frames, as shown in Fig. 1b. Our sensor has a pixel latency of 100 μs–220 μs, which sets the lower limit for a meaningful accumulation time. To ensure enough events for robust frame reconstruction, we set this time window between 1 and 2 ms based on the specific sample under investigation. Importantly, the chosen accumulation time directly determines the system’s frame rate while also affecting the reconstruction resolution, details of which are elaborated in Section 6 and Section 11 of Supplement 1.
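For illustration, the sketch below shows one way to represent such an event stream and to accumulate polarity-sorted events within a single window; the structured dtype and the ±1 polarity encoding are assumptions for this example rather than the vendor's data format.

```python
import numpy as np

# Hypothetical structured layout: one record per event with pixel coordinates,
# a microsecond timestamp, and polarity encoded as +1 or -1.
event_dtype = np.dtype([('x', np.uint16), ('y', np.uint16),
                        ('t', np.int64),  ('p', np.int8)])

def accumulate(events, t_start, accum_us, shape=(720, 1280)):
    """Sum polarity-signed events within one accumulation window
    [t_start, t_start + accum_us) to form a signed frame."""
    sel = events[(events['t'] >= t_start) & (events['t'] < t_start + accum_us)]
    frame = np.zeros(shape, dtype=np.int32)
    np.add.at(frame, (sel['y'], sel['x']), sel['p'].astype(np.int32))
    return frame
```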

Rather than simply summing the events within the accumulation period, we apply a built-in time-surface algorithm for the post-processing of the raw event data. This algorithm employs an exponential time-decay function to compute a time surface, encapsulating both the spatial and temporal correlations among adjacent pixels. Pixel values in this time surface, ranging from 0 to 255, are indicative of historical temporal activity, as illustrated in Fig. 1c. This approach offers a spatiotemporal representation for each event while mitigating motion blur artifacts (see Section 3 of Supplement 1 for details). In addition, we apply median filtering (medfilt2 in MATLAB) to suppress the sensor noise. Subsequently, these time-surface frames are processed by a standard light field refocusing algorithm36 to yield a 3D volumetric reconstruction, as shown in Fig. 1d. To enhance the quality of the reconstruction, we either apply a predefined threshold or a deep neural network to remove ghosting artifacts introduced by the refocusing algorithm. For visualization, we opt for either depth- or time-encoding color schemes when appropriate, as in Fig. 1d, e. A detailed reconstruction diagram can be found in Section 2 of Supplement 1.
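The refocusing step itself is a shift-and-sum over the 5 × 5 sub-aperture views; a simplified sketch with integer per-view shifts (in practice the shifts are calibrated and may be sub-pixel) is given below.

```python
import numpy as np

def refocus(views, shifts_per_depth):
    """Shift-and-sum light-field refocusing sketch.
    views: (5, 5, H, W) sub-aperture images cropped from one time-surface frame;
    shifts_per_depth: iterable of (5, 5, 2) integer (dy, dx) shift maps, one per
    depth plane, derived from the calibrated view disparity."""
    n_u, n_v, H, W = views.shape
    volume = []
    for shifts in shifts_per_depth:
        plane = np.zeros((H, W), dtype=np.float32)
        for u in range(n_u):
            for v in range(n_v):
                dy, dx = shifts[u, v]
                plane += np.roll(views[u, v], (dy, dx), axis=(0, 1))
        volume.append(plane / (n_u * n_v))  # averaging also suppresses noise
    return np.stack(volume)                 # that is uncorrelated across views
```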

The deep neural network is a modified CM2Net9, which is developed to enhance the spatial resolution and improve the quality of the reconstruction. For the network training, we collect event streams of 200 µm thick phantoms embedded with 2 µm fluorescent particles moving at 2.5 mm s−1. Corresponding ground-truth volumes are obtained by axial scanning using conventional wide-field microscopy (20X, 0.75 NA) with a 3 µm step size, resulting in a total of 4700 training pairs. The training uses a loss function composed of the Normalized Pearson Correlation Coefficient (NPCC) and the Mean Absolute Error (MAE):

$${L}_{{total}}={L}_{{MAE}}-{L}_{{NPCC}}$$
(1)
$${L}_{{MAE}}=\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}\left|{y}_{i}-{x}_{i}\right|$$
(2)
$${L}_{{NPCC}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}({y}_{i}-\bar{y})({x}_{i}-\bar{x})}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{({y}_{i}-\bar{y})}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{({x}_{i}-\bar{x})}^{2}}}$$
(3)

where \({y}_{i}\) and \({x}_{i}\) denote the true and predicted values, and \(\bar{y}\) and \(\bar{x}\) represent their respective averaged values, with i indexing the pixels and n representing the total number of pixels. This dual loss is designed to simultaneously enhance spatial alignment and reduce intensity discrepancies. The network is implemented in PyTorch and runs on an Nvidia RTX 4090 GPU with a batch size of 8. The entire training process takes 24 h. The detailed network structure and implementations are provided in Section 10 of Supplement 1.
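A minimal PyTorch sketch of this combined objective is given below; the assumed (N, C, D, H, W) volume layout and the small epsilon for numerical stability are illustrative choices, not details taken from the released implementation.

```python
import torch

def npcc(pred, target, eps=1e-8):
    """Normalized Pearson correlation coefficient, averaged over a batch of
    volumes with shape (N, C, D, H, W)."""
    dims = (1, 2, 3, 4)
    p = pred - pred.mean(dim=dims, keepdim=True)
    t = target - target.mean(dim=dims, keepdim=True)
    num = (p * t).sum(dim=dims)
    den = torch.sqrt((p ** 2).sum(dim=dims) * (t ** 2).sum(dim=dims)) + eps
    return (num / den).mean()

def total_loss(pred, target):
    """L_total = L_MAE - L_NPCC, as in Eqs. (1)-(3)."""
    l_mae = torch.mean(torch.abs(pred - target))
    return l_mae - npcc(pred, target)
```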

Preparation of brain slice

The preparation of rodent brain slices exhibiting GFP-tagged neurons within the bed nucleus of the stria terminalis (BNST) involved a series of detailed procedures. Initially, male C57Bl/6 mice were anesthetized with a steady inhalation of 2% isoflurane and received preemptive pain relief through buprenorphine and ketoprofen (0.5 and 5 mg kg−1, respectively). Then their heads were fixed within a digital stereotaxic apparatus (David Kopf Instruments, Tujunga, CA, USA). Following a midline incision on the skull and a small craniotomy bilaterally over each injection site, a viral vector carrying GFP (100–200 nL, pAAV-CAG-GFP, Addgene #37825-AAVrg) was injected into the BNST using a glass micropipette coupled with a Nanoject II injector (Drummond), targeting coordinates 1.0 mm lateral from the midline, 0.4 mm anterior to bregma, and a depth of 4.3 mm from dura. The retrograde virus labeled both local neurons at the injection site in the BNST as well as upstream brain areas. Post-surgery, the mice were allowed a recovery period of two to three weeks for the virus to express before being deeply sedated for the final procedure. Their brains were then fixed in a paraformaldehyde solution, cryoprotected, and sectioned coronally at 75 μm thickness using a cryostat. These sections were finally mounted on Superfrost Plus slides (Fisher) with Vectashield mounting medium (Vector Labs).

This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All animals were handled according to approved Institutional Animal Care and Use Committee (IACUC) protocols (#201800540) of Boston University.

Preparation of fluorescent beads phantom

To prepare the fluorescent particle phantom, we accurately pipette 5 μL of fluorescent beads (2 μm diameter, 1% concentration, Fluoro-Max Dyed Green Aqueous Fluorescent Particles) into 2 mL of clear resin (Formlabs, catalog no. RS-F2-GPCL-04) contained within a suitable tube. The mixture is then homogenized by an ultrasonic probe sonicator (Fisherbrand™ Model 50 Sonic Dismembrator) to ensure a uniform distribution of beads within the resin. Subsequently, the solution is carefully dispensed into a rectangular mold, designed with a 200 μm depth, until the mold is filled. A glass slide is then placed over the mold to cover it. The resin-bead solution is solidified by exposing it to UV light. Finally, the glass slide is removed to retrieve a uniform phantom with a precise thickness of 200 μm.

Preparation of C. elegans

The transgenic Caenorhabditis elegans (C. elegans) strain ZB4510 [mec-4p::GFP], genetically modified to express green fluorescent protein (GFP)47, is cultivated on nematode growth medium (NGM) agarose plates. The plates are coated with Escherichia coli to provide a consistent food source. We adhere to a strict subculturing routine, transferring the C. elegans to fresh media every three days. Before imaging, a 20 μL layer of 2% agarose solution (Sigma-Aldrich) is prepared on a 15 mm by 15 mm section of a glass microscope slide to form a pad. A small population of C. elegans is then transferred onto this agarose pad, and a drop of S-Basal is applied to facilitate free movement of the C. elegans within the solution. The prepared slide is immediately placed on the stage of the microscope for imaging.