Trigger Merging Module for the J-PARC E16 Experiment by Ichikawa, M. et al.
M. Ichikawa, T. N. Takahashi, K. Aoki, S. Ashikaga, E. Hamada, R. Honda, Y. Igarashi, M. Ikeno, D. Kawama,
M. Naruki, K. Ozawa, H. Sendai, K. N. Suzuki, M. Tanaka, T. Uchida, and S. Yokkaichi
Abstract—The J-PARC E16 experiment is planned to measure the invariant mass of φ
mesons in a nuclear medium. A trigger merging module (TRG-MRG) has been developed
to detect leading edges in 256 channels of discriminator-output
signals and to transmit the serialized hit data to a trigger decision
module over four optical links. Test results show that the TRG-MRG
performs sufficiently well as a 1 ns TDC and as a data
multiplexer with four 6.25 Gbps transceivers.
I. INTRODUCTION
A major part of the hadron mass in vacuum is considered to originate from the spontaneous breaking
of chiral symmetry, characterized by the quark condensate. At finite density or high temperature,
the broken symmetry is expected to be partially restored and the hadron mass to be modified.
The J-PARC E16 experiment measures the invariant mass
of φ mesons in a nuclear medium and investigates the partial
restoration of the broken chiral symmetry at nuclear density
[1], [2].
In the experiment, φ mesons are produced in nuclei by
exposing nuclear targets to a 30 GeV proton beam of 1 × 10^10 /pulse, with a
pulse duration of 2 s, at the high-momentum beamline
at J-PARC. We measure φ mesons in the electron-positron
decay channel and reconstruct the invariant mass.
A. Setup of Spectrometer
Figure 1 shows a top view of the spectrometer. Four
types of detectors are used for momentum measurement of electrons
and positrons. From the inner side, silicon-strip detectors (SSD)
and three layers of GEM trackers (GTR1, 2, 3) are located
Manuscript received June 24, 2018. This work was supported by the
RIKEN Junior Research Associate Program, a Grant-in-Aid for JSPS Fellows
(12J01196), and MEXT/JSPS KAKENHI (Nos. 21105004, 26247048, and
15K17669).
M. Ichikawa, S. Ashikaga, M. Naruki, and K. N. Suzuki are with the Department
of Physics, Kyoto University, Kitashirakawa Sakyo-ku, Kyoto 606-8502,
Japan.
M. Ichikawa is also with the RIKEN Cluster for Pioneering Research, RIKEN, 2-1
Hirosawa, Wako, Saitama 351-0198, Japan.
T. N. Takahashi is with the Research Center for Nuclear Physics (RCNP),
Osaka University, 10-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan.
K. Aoki, E. Hamada, Y. Igarashi, M. Ikeno, K. Ozawa, H. Sendai, M.
Tanaka, and T. Uchida are with the Institute of Particle and Nuclear Studies,
KEK, 1-1 Oho, Tsukuba, Ibaraki 305-0801, Japan.
R. Honda is with the Department of Physics, Graduate School of Science,
Tohoku University, Sendai, Miyagi 980-8578, Japan.
S. Yokkaichi is with the RIKEN Nishina Center, RIKEN, 2-1 Hirosawa, Wako,
Saitama 351-0198, Japan.
[Figure 1: top view and enlarged one-module view of the spectrometer. Labels: beam, target, magnetic field B, return yoke, SSD, GEM trackers (GTR1, GTR2, GTR3), HBD, lead-glass calorimeter (LG), charged particle, Cherenkov photon. Tracking (SSD, GEM trackers): mass resolution < 10.0 MeV/c^2. Electron ID (HBD, lead-glass calorimeter): pion rejection < 10^-3. 8 modules (Run 1), 26 modules (Run 2).]
Fig. 1. (Top) Top view of the spectrometer. (Bottom) Enlarged view of
one module of detectors.
in the strong magnetic field for flight-path detection and momentum
reconstruction [3]. Electron/positron identification
is performed with hadron-blind detectors (HBD) and lead-glass
calorimeters (LG) [4]. The number of readout channels
is 112,996, and waveform data are taken from all types of detectors
to resolve piled-up signals. The waveform data are
buffered in modules using APV25-S1 chips [5] and DRS4
chips [6], with buffering times of 4 µs and 2 µs, respectively
[7]. Therefore, the required latency for the trigger signal is less
than 2 µs.
B. Trigger System
For the trigger generation, discriminator-output signals from
GTR3, HBD, and LG are used. The number of trigger channels
arXiv:1806.10671v1 [physics.ins-det] 25 Jun 2018
[Figure 2: block diagram of the trigger system. GTR3 and HBD signals pass through ASDs, and LG signals through DRS4 ADC/TDC modules, into the trigger merging modules (2,620 LVDS signals in total). At most 64 optical cables connect the trigger merging modules to the trigger decision module (Belle II UT3), which is followed by the trigger distribution module (Belle II FTSW) leading to the readout system.]
Fig. 2. Overview of the trigger system.
TABLE I
REQUIREMENTS FOR THE TRG-MRG

Item              Value
Function          TDC + optical transceiver
Cable reduction   2,620 LVDS → <64 optical cables
Single rate       1 MHz/ch
Sampling time     <3 ns
Latency           <500 ns
is 2,620. The maximum single rate is expected to be about
1 MHz/ch, and the minimum width of the discriminator-output
signals is 3 ns. Therefore, the sampling time must be less than
3 ns. An overview of the trigger system is shown in Fig. 2. Analog
signals from the detectors are discriminated by the DRS4 ADC/TDC
or by the ASD (Amplifier-Shaper-Discriminator) developed for the
experiment [8]–[10]. In the trigger merging modules, called
TRG-MRG, leading edges are detected and the serialized hit data
are transmitted to a trigger decision module via optical
transceivers. The Belle II UT3, which has 16 QSFP+ ports, is used as
the trigger decision module [11]. Finally, the trigger signal is
distributed to the readout modules by the Belle II FTSW [12].
The latency before the TRG-MRG is estimated to be 600 ns,
mainly due to the drift time of GTR3. To keep the total latency
below the waveform buffering time of 2 µs, we set the design
latencies of leading-edge detection and transmission,
trigger decision, and trigger distribution to 500 ns, 500 ns, and
300 ns, respectively. With these design values, the total latency
including the drift time of GTR3 is 1,900 ns.
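The budget above is a simple sum, and the following minimal sketch just tabulates it and reports the margin against the 2 µs buffering time. The stage labels and variable names are illustrative; only the numbers come from the text.

```python
# Latency budget of the trigger chain, using the design values quoted above (all in ns).
BUFFER_NS = 2000  # shortest waveform buffering time (DRS4 modules, 2 µs)

budget_ns = {
    "before TRG-MRG (mainly GTR3 drift time)": 600,
    "TRG-MRG (edge detection and transmission)": 500,
    "trigger decision": 500,
    "trigger distribution": 300,
}

total_ns = sum(budget_ns.values())
margin_ns = BUFFER_NS - total_ns  # time left before the waveform buffer is overwritten

for stage, latency in budget_ns.items():
    print(f"{stage:45s} {latency:5d} ns")
print(f"{'total':45s} {total_ns:5d} ns (margin {margin_ns} ns)")
```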
The requirements for the TRG-MRG are summarized in
TABLE I.
II. DEVELOPMENT
Figure 3 shows a photograph of the TRG-MRG. The module consists
of one main board and two mezzanine cards. Each mezzanine
card carries four 32-channel LVDS receivers and converters from
LVDS to 1.8 V LVCMOS, and can be replaced to match
the format of the input connectors. On the main board,
two FPGAs (Xilinx Kintex-7 160T-2 and Xilinx Spartan-3
50AN-4), two 125 MHz crystal oscillators, and eight SFP+
transceivers are installed [13], [14]. A 1 ns, 256-channel
multi-hit TDC and four-lane GTX transceivers running at 6.25 Gbps
are implemented in the Kintex-7 using Vivado 2017.2 provided by
Xilinx. The channel reduction from 2,620 channels to at most
64 optical transceivers is realized by using about 15 TRG-MRGs.
[Figure 3: photograph of the main board and a mezzanine card. Main board labels: SFP+ transceiver × 8, FPGA (Xilinx Spartan-3 50AN-4), FPGA (Xilinx Kintex-7 160T-2), RJ45 for remote JTAG, RJ45 to FTSW, connector to mezzanine × 2, NIM I/O × 2, 12 V power supply, DC/DC, 125 MHz crystal oscillator. Mezzanine card labels: 128 ch LVDS receiver, LVDS to 1.8 V LVCMOS converter.]
Fig. 3. Photograph of the TRG-MRG.
[Figure 4: firmware block diagram. TDC section (250 MHz): 256 ch input, 500 MHz DDR deserializers, edge detectors, delay controllers (up to 1,024 ns), and four hit buffers, each outputting 32 bit × 5 cycles per 64 ns. Transceiver section (156.25 MHz): FIFO and Aurora 8B/10B × 4 lanes, to SFP+.]
Fig. 4. The diagram of the firmware implemented in the TRG-MRG.
A. Firmware
The diagram of the firmware implemented in the FPGA is
shown in Fig. 4. The firmware of the TDC and transceiver sections
is explained in the following paragraphs.
1) TDC Section: The TDC section consists of deserializers,
edge detectors, delay controllers, and hit buffers. The input
signals are sampled every 1 ns by 500 MHz DDR (double
data rate) deserializers using ISERDESE2, provided with Vivado. Each
deserializer converts a 1 Gbps stream into four 250 Mbps streams. In the edge
detector, leading edges are detected in each 4-bit word.
To calibrate the intrinsic time differences among channels,
delays are added in the delay controller. This component is
implemented with a RAM-based shift register to save
flip-flops, and can delay the data of each channel
by up to 1,024 ns in 4 ns steps. When a leading edge is
detected, the hit timing and channel number are stored
in the hit buffer. At most eight hits are buffered per 64 ns
for each group of 64 channels. The efficiency of event transfer under this
criterion is discussed in Sec. III-G. The data sent to the transceiver
section are 32 bits wide and are output during 5 cycles
for each group of 64 channels.
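The data flow of the TDC section can be illustrated with a small software model. This is only a sketch under stated assumptions, not the firmware. Each channel is represented as a list of 1 ns samples, edges are searched word by word in 4-bit words, and hits are grouped per 64 ns window and 64-channel group with a cap of eight. Which hits are kept when more than eight arrive is not stated in the text; keeping the earliest ones is an assumption of this model, and all function names are hypothetical.

```python
from collections import defaultdict

WORD_BITS = 4    # samples delivered per 250 MHz clock cycle (1 ns each)
WINDOW_NS = 64   # hit-buffer window
MAX_HITS = 8     # hits kept per window and 64-channel group
GROUP_SIZE = 64  # channels served by one hit buffer

def leading_edges(samples):
    """Return the sample indices (in ns) where the level goes from 0 to 1.

    The samples are processed in 4-bit words; the last bit of the previous
    word is carried over so that edges on a word boundary are not missed.
    """
    edges = []
    prev = 0
    for base in range(0, len(samples), WORD_BITS):
        for offset, bit in enumerate(samples[base:base + WORD_BITS]):
            if bit == 1 and prev == 0:
                edges.append(base + offset)
            prev = bit
    return edges

def fill_hit_buffers(channel_samples):
    """Collect hits per (window, group) and keep at most MAX_HITS of them.

    Assumption: the earliest hits in a window are the ones that are kept.
    Returns (kept, dropped); kept maps (window, group) to [(time, channel), ...].
    """
    candidates = defaultdict(list)
    for channel, samples in channel_samples.items():
        for t in leading_edges(samples):
            candidates[(t // WINDOW_NS, channel // GROUP_SIZE)].append((t, channel))
    kept, dropped = {}, 0
    for key, hits in candidates.items():
        hits.sort()
        kept[key] = hits[:MAX_HITS]
        dropped += max(0, len(hits) - MAX_HITS)
    return kept, dropped

def pulse(start, width, length=128):
    """Hypothetical test input: one pulse of `width` ns starting at `start` ns."""
    return [1 if start <= t < start + width else 0 for t in range(length)]

# Ten channels of group 0 fire in the first window, so two hits are dropped.
inputs = {ch: pulse(start=3 + ch, width=5) for ch in range(10)}
inputs[70] = pulse(start=67, width=3)  # group 1, second window
kept, dropped = fill_hit_buffers(inputs)
print(len(kept[(0, 0)]), dropped, kept[(1, 1)])
```

The model ignores the delay controller, which in the firmware would shift each channel by a multiple of 4 ns before the hits are buffered.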
2) Transceiver Section: The transceiver section consists of
a FIFO and the Aurora 8B/10B protocol [15]. Aurora 8B/10B
is a link-layer protocol for high-speed serial communication.
Because the clock frequency of the protocol is determined
by the line rate and lane width of the transceiver, the FIFO is
installed for clock domain crossing. On the Aurora transmitting
side, the 32 bit/lane data are encoded into 40 bit/lane data and
serialized. On the receiving side, the data are deserialized and
decoded.
[Figure 5: block diagram of the setup. Labels: function generator (1 kHz pulse, width ~50 ns), discriminator, delay, TRG-MRG.]
Fig. 5. Schematic of the circuit setup for the measurement of the time
resolution.
[Figure 6: histogram of the measured time difference; x-axis: time difference [ns] (14.5 to 18.5), y-axis: count (×10^3). Entries 1463002, mean 16.05, std dev 0.2212.]
Fig. 6. An example of the distribution of the measured time difference.
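The relation between line rate, encoding, and user clock can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this paper (6.25 Gbps line rate, 32-to-40-bit encoding, and the hit-buffer output of 32 bit × 5 cycles per 64 ns). Mapping one 64-channel group to one lane is an assumption made here for illustration, and the variable names are hypothetical.

```python
LINE_RATE_GBPS = 6.25  # per-lane line rate quoted in the text
WORD_BITS = 32         # user data width per lane
ENCODED_BITS = 40      # 8B/10B: every 32 data bits become 40 line bits

# Usable payload rate per lane after removing the 8B/10B overhead.
payload_gbps = LINE_RATE_GBPS * WORD_BITS / ENCODED_BITS

# Rate of 32-bit words per lane, i.e. the user clock of the protocol.
user_clock_mhz = payload_gbps * 1e3 / WORD_BITS

# Offered load from one hit buffer: 32 bit x 5 cycles every 64 ns
# (bits per ns equals Gbps). Assumes one 64-channel group per lane.
offered_gbps = 32 * 5 / 64.0

print(payload_gbps, user_clock_mhz, offered_gbps)
```

The resulting user clock agrees with the 156.25 MHz Aurora clock quoted in Sec. III-F, which is why a FIFO is needed between it and the 250 MHz TDC clock.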
III. PERFORMANCE EVALUATION
The following performance items were evaluated.
A Time resolution
B Integral nonlinearity (INL)
C Differential nonlinearity (DNL)
D Minimum pulse width
E Double pulse separation
F Latency
G Transfer efficiency
A. Time Resolution
The time resolution was evaluated by inputting two signals
with a fixed delay into two channels of the TRG-MRG, as
illustrated in Fig. 5. The time difference of the outputs
from the TRG-MRG was measured, as shown in Fig. 6. The
time resolution is defined as $\sigma/\sqrt{2}$, where $\sigma$ is
the standard deviation of the distribution. Even if the
TDC has no clock jitter, the measured time differences are spread over at least
2 LSB (least significant bit); this spread is called the quantization
error. The quantization error depends on the remainder of
(time difference) / LSB, denoted $t_{in}$ in this paper, as
$\sqrt{t_{in}(1 - t_{in})}$. The measured distribution is in good agreement
with the expected quantization error, as shown in Fig. 7. A
time resolution of better than 0.35 ns was obtained.
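The $\sqrt{t_{in}(1 - t_{in})}$ dependence can be reproduced with an idealized model: a jitter-free TDC with 1 ns bins measures two edges separated by a fixed delay, while the phase of the first edge relative to the sampling clock is spread uniformly over one bin. The sketch below is not the analysis code of this paper; the function names and the uniform-phase assumption are illustrative.

```python
import math

def quantized_difference_std(delay_ns, lsb_ns=1.0, n_phases=10000):
    """Mean and standard deviation of TDC(t0 + delay) - TDC(t0).

    t0 is stepped uniformly over one LSB, which models an input that is
    asynchronous to the sampling clock. The TDC itself is ideal (no jitter).
    """
    diffs = []
    for i in range(n_phases):
        t0 = (i + 0.5) / n_phases * lsb_ns  # phase of the first edge in its bin
        start = math.floor(t0 / lsb_ns)
        stop = math.floor((t0 + delay_ns) / lsb_ns)
        diffs.append((stop - start) * lsb_ns)
    mean = sum(diffs) / len(diffs)
    var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    return mean, math.sqrt(var)

def expected_std(delay_ns, lsb_ns=1.0):
    """Analytic quantization error sqrt(t_in * (1 - t_in)) in units of ns."""
    t_in = (delay_ns / lsb_ns) % 1.0
    return lsb_ns * math.sqrt(t_in * (1.0 - t_in))

for delay in (16.05, 16.5):
    mean, spread = quantized_difference_std(delay)
    print(delay, round(mean, 3), round(spread, 4), round(expected_std(delay), 4))
```

For a delay of 16.05 ns the model gives a spread close to the standard deviation of 0.2212 ns printed in the statistics box of Fig. 6, which suggests that the measured width is dominated by quantization.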
B. Integral Nonlinearity
The integral nonlinearity (INL) was estimated using the same
data as described in Sec. III-A. Figure 8 shows the relation
[Figure 7: time resolution [ns] (0 to 0.4) versus t_in [ns] (0 to 1).]
Fig. 7. The t_in dependence of the time resolution. The black points indicate
the measured values and the red line is the calculated quantization
error.
[Figure 8: time difference between output signals [ns] versus time difference between input signals [ns], both from 0 to 2,200 ns, with a linear fit: χ²/ndf = 29.17/35, Prob = 0.745, p0 = 0.7259 ± 0.003807, p1 = 1 ± 5.194e−06.]
Fig. 8. The relation between input time difference and output time difference.
The black points indicate the measured values and the red line is the result of
the fit.
between the input time difference and the output time difference. The
measured points were fitted with a linear function At + B, and the
residuals between the measured points and the fitted line were calculated. The
INL, taken as the range of these residuals, was estimated to be
[−0.04 LSB, +0.04 LSB] (Fig. 9). The effect of the INL turned
out to be negligible for the performance.
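The procedure amounts to a straight-line fit followed by a residual scan. The sketch below shows it with an unweighted least-squares fit. The fit in this paper reports a χ² and so presumably uses the measurement errors as weights, which this sketch omits. The input values are hypothetical and are not the measured data.

```python
def linear_fit(xs, ys):
    """Ordinary (unweighted) least-squares fit y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def inl_range(xs, ys, lsb=1.0):
    """Return the (min, max) residual from the fitted line, in units of LSB."""
    a, b = linear_fit(xs, ys)
    residuals = [(y - (a * x + b)) / lsb for x, y in zip(xs, ys)]
    return min(residuals), max(residuals)

# Hypothetical input/output time differences in ns (NOT the measured data):
# unit slope, a constant offset, and small deviations of a few 0.01 LSB.
inputs_ns = [10.0, 50.0, 100.0, 500.0, 1000.0, 2000.0]
deviations = [0.02, -0.03, 0.01, 0.03, -0.02, -0.01]
outputs_ns = [t + 0.7 + d for t, d in zip(inputs_ns, deviations)]

a, b = linear_fit(inputs_ns, outputs_ns)
inl_lo, inl_hi = inl_range(inputs_ns, outputs_ns)
print(a, b, inl_lo, inl_hi)
```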
C. Differential Nonlinearity
In the TRG-MRG, the differential nonlinearity (DNL) is
expected to originate from the deserializer: inaccuracy
of the clock-generator output and skew of the interconnection
lengths in the deserializer degrade the DNL. As
mentioned above, the input data are deserialized into 4 bits
every 4 ns. Therefore, the DNL is expected to have a periodicity
of 4 ns. The DNL was measured by a code density
test using a clock with a period of 80.008 ns. Because this period is
0.008 ns longer than a multiple of 4 ns, the edges of
the input clock are expected to be distributed at intervals of
0.008 ns across the 4 ns period. The distribution
of these edges is shown in Fig. 10. As expected, the 4 ns
periodicity is seen. The DNL was estimated to be within [−0.022 LSB,
[Figure 9: residual [LSB] (−0.06 to 0.06) versus time difference between input signals [ns] (logarithmic axis, 10 to 10^3).]
Fig. 9. The result of the INL measurement.
[Figure 10: count (×10^3, 890 to 930) versus remainder of timing / 8 [ns] (0 to 7).]
Fig. 10. The distribution of the edges of the 80.008 ns clock.
+0.022 LSB], as shown in Fig. 11. The effect of DNL turned
out to be negligible for the performance.
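The code density method can be sketched as follows. Edges of a clock with a period of 80.008 ns sweep the 4 ns cycle in 0.008 ns steps, so the number of edges landing in each 1 ns bin is proportional to the effective width of that bin. The bin boundaries used below and the 0.004 ns starting phase are hypothetical values chosen for illustration; they do not describe the measured module.

```python
import bisect

PERIOD_NS = 80.008  # test clock period from the text
CYCLE_NS = 4.0      # deserializer word period (four 1 ns samples)
PHASE0_NS = 0.004   # hypothetical starting phase of the clock

def code_density_dnl(bin_edges_ns, n_edges):
    """Estimate the DNL of the four 1 ns bins inside one 4 ns cycle.

    bin_edges_ns holds the three internal bin boundaries within the cycle
    (ideal TDC: [1.0, 2.0, 3.0]). Returns the DNL of each bin in LSB,
    computed as count / mean_count - 1.
    """
    counts = [0, 0, 0, 0]
    for k in range(n_edges):
        phase = (k * PERIOD_NS + PHASE0_NS) % CYCLE_NS  # edge time in the cycle
        counts[bisect.bisect_right(bin_edges_ns, phase)] += 1
    mean = sum(counts) / len(counts)
    return [c / mean - 1.0 for c in counts]

# 5,000 edges sweep the 4 ns cycle ten times (500 steps of 0.008 ns each).
ideal = code_density_dnl([1.0, 2.0, 3.0], 5000)
skewed = code_density_dnl([1.024, 2.0, 2.976], 5000)  # hypothetical skew
print(ideal)
print(skewed)
```

With the hypothetical boundaries the first and last bins are 1.024 ns wide and the two middle bins 0.976 ns, and the estimate recovers the corresponding deviations of about ±0.024 LSB.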
D. Minimum Pulse Width
As mentioned previously, the TRG-MRG must detect signals
as narrow as 3 ns, which are expected in the experiment. By inputting
[Figure 11: DNL [LSB] (−0.02 to 0.02) versus remainder of timing / 4 [ns] (−0.5 to 3.5).]
Fig. 11. The result of the DNL measurement.
[Figure 12: sketch of two input pulses separated by 2.5 ns, with the discriminator threshold indicated.]
Fig. 12. A sketch of the input signals. The signals are shown as single-ended
and negative logic for simplicity.
[Figure 13: timeline obtained with the circuit simulator in Vivado 2017.2. The discriminator output is received at 0 ns, leading edges are detected and buffered for 64 ns, and the data are transmitted by 179 ns.]
Fig. 13. The result of the estimation of the latency of the TDC section.
such narrow signals, the detection efficiency was measured.
As a result, the TRG-MRG was found to detect
signals of 1.0 ns width with 100% efficiency. Therefore, the
minimum pulse width of the TRG-MRG
satisfies the requirement of the experiment.
E. Double Pulse Separation
The double pulse separation was estimated by measuring the
signal detection efficiency while varying the interval between
the trailing edge of the first signal and the leading edge of the second
signal, as sketched in Fig. 12. The TRG-MRG can separate two
signals with an interval of 2.5 ns with 100% efficiency. From this
result, the inefficiency due to the TRG-MRG is estimated to
be 0.27% in the worst case.
F. Latency
The latency was evaluated separately for the two sections
defined in Sec. II-A, the TDC and transceiver sections.
1) TDC Section: The latency of the TDC section was
estimated using a logic simulator in Vivado. The delays added
by the delay controller are counted in the 600 ns latency
before the TRG-MRG. The result is shown in Fig. 13. Including
the buffering time of 64 ns, the maximal latency of the TDC section is
179 ns.
2) Transceiver Section: The latency of the transceiver
section was measured by feeding the output of a 250 MHz
counter into the FIFO and receiving the data after they passed through Aurora and
an optical cable of 1 m (the expected length). The latency is mainly
determined by the time needed for data receiving (deserializing
and decoding). Figure 14 shows a schematic view of the
measurement. The data are transmitted continuously for five cycles (20 ns)
in every 16 cycles (64 ns) of the 250 MHz clock,
corresponding to the transmission pattern in the experiment.
The result is shown in Fig. 15. A latency of
290 ns is obtained for 99.8% of the data, which is consistent with the
[Figure 14: block diagram of the measurement. Labels: TRG-MRG, FPGA, 250 MHz counter, FIFO, Aurora 8B/10B × 4 lanes, SFP+ × 4, optical cable (1 m), latency, FIFO, SiTCP, RJ45, PC.]
Fig. 14. Schematic of the circuit for the measurement of the latency of
the transceiver section. The RJ45 connector is attached to one SFP+ port.
[Figure 15: histogram of the latency of the transceiver section; x-axis: latency [ns] (285 to 315), y-axis: count (logarithmic, 10^4 to 10^7). Entries 3.5e+07, mean 288, std dev 3.692.]
Fig. 15. The result of the measurement of the latency of the transceiver
section.
result from a measurement with a logic analyzer in Vivado.
On the other hand, 0.2% of the data have a longer latency
of 310 ns. This seems to be due to the busy signal that the Aurora
protocol asserts for clock compensation. Clock compensation is a
basic function of the Aurora protocol; it asserts busy for
three cycles (19.2 ns) in every 2,500 cycles of the
156.25 MHz Aurora clock [15]. The size and the
fraction of the latency increase are consistent with the
expectation from clock compensation. As a result, the maximal latency
of the transceiver section is 318 ns.
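The numbers quoted for clock compensation can be cross-checked with simple arithmetic. The sketch below uses only values from the text; the variable names are illustrative. It computes the share of time the link is busy, not the fraction of delayed data, because data arriving shortly before a busy period can also be delayed.

```python
AURORA_CLOCK_MHZ = 156.25  # Aurora clock quoted in the text
BUSY_CYCLES = 3            # busy cycles per clock-compensation sequence
PERIOD_CYCLES = 2500       # the sequence repeats every 2,500 cycles
NOMINAL_LATENCY_NS = 290   # latency observed for 99.8% of the data

cycle_ns = 1e3 / AURORA_CLOCK_MHZ            # length of one Aurora clock cycle
busy_ns = BUSY_CYCLES * cycle_ns             # duration of one busy period
busy_fraction = BUSY_CYCLES / PERIOD_CYCLES  # share of time the link is busy

# Latency of a word that has to wait out a complete busy period.
delayed_latency_ns = NOMINAL_LATENCY_NS + busy_ns

print(cycle_ns, busy_ns, busy_fraction, delayed_latency_ns)
```

The delayed latency of about 309 ns matches the observed 310 ns tail, and a busy fraction of 0.12% is of the same order as the observed 0.2%.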
3) Total Latency: The latency of the TDC and transmission
is estimated to be at most 179 + 318 = 497 ns, which satisfies the
requirement of less than 500 ns.
G. Transfer Efficiency
As mentioned in Sec. II-A1, the TRG-MRG transfers hit data
under the criterion that at most eight
hits are buffered per 64 ns for each group of 64 channels. A detector
simulation based on Geant4, a toolkit for simulating the passage of particles through matter,
was performed to estimate the transfer efficiency under the
expected experimental conditions [16]–[18]. In the experiment,
the proton-beam intensity is 1 × 10^10 /pulse and the maximal
single rate reaches 1 MHz/ch. Considering the micro
[Figure 16: histograms of the number of hits (0 to 12) versus count (0 to 35,000). Statistics boxes: h0 with 132808 entries, mean 2.094, RMS 1.491; h1 with 132808 entries, mean 3.615, RMS 1.949.]
Fig. 16. Simulated hit multiplicities for 64 ns with the beam intensity of
1 × 10^10 /pulse (full line) and 2 × 10^10 /pulse (dotted line).
[Figure 17: TRG-MRG transfer efficiency (0.94 to 1) versus beam rate [GHz] (6 to 14).]
Fig. 17. The beam rate dependence of the transfer efficiency (simulation).
structure of the beam, the instantaneous beam intensity
is distributed up to 2 × 10^10 /pulse. Figure 16 shows the hit
multiplicities in 64 ns time windows at beam intensities
of 1 × 10^10 /pulse (full line) and 2 × 10^10 /pulse (dotted line).
The fraction of discarded hits depends on the beam intensity.
The beam rate dependence of the transfer efficiency is shown
in Fig. 17. At the expected beam intensity of 1 × 10^10 /pulse
(5 GHz), the transfer efficiency is expected to be 99.95%. Even
at an intensity of 2 × 10^10 /pulse (10 GHz), the efficiency
stays better than 98%. In conclusion, it is confirmed that the
developed module meets the required performance.
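The mechanism behind the efficiency loss can be illustrated with a toy model in which the hit multiplicity per 64 ns window follows a Poisson distribution and hits beyond the eighth are discarded. This is not the Geant4 simulation used above and is not expected to reproduce its numbers, since the simulated multiplicity need not be Poissonian and the efficiency definition used here (fraction of hits kept) is an assumption. The mean multiplicities are taken from the statistics boxes of Fig. 16, and assigning the smaller mean to 1 × 10^10 /pulse is also an assumption.

```python
import math

MAX_HITS = 8  # hits kept per 64 ns window and 64-channel group

def poisson_pmf(n, mean):
    """Probability of observing n hits for a Poisson distribution."""
    return math.exp(-mean) * mean ** n / math.factorial(n)

def hit_transfer_efficiency(mean_hits, max_hits=MAX_HITS, n_max=60):
    """Fraction of hits kept when each window keeps at most max_hits hits.

    Assumes Poisson-distributed multiplicity; the sum is truncated at n_max.
    """
    kept = sum(min(n, max_hits) * poisson_pmf(n, mean_hits) for n in range(n_max))
    return kept / mean_hits

def beam_rate_ghz(protons_per_pulse, pulse_length_s=2.0):
    """Average beam rate for a given intensity and the 2 s pulse duration."""
    return protons_per_pulse / pulse_length_s / 1e9

for intensity, mean_hits in ((1e10, 2.094), (2e10, 3.615)):
    print(beam_rate_ghz(intensity), hit_transfer_efficiency(mean_hits))
```

The toy model shows the qualitative behaviour of Fig. 17: the loss is small at the nominal intensity and grows as the mean multiplicity approaches the buffer depth of eight.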
H. Summary of the Performance Evaluation
Finally, the results of the performance
evaluation are summarized in TABLE II. All the results satisfy
the requirements of the experiment.
IV. CONCLUSION
The J-PARC E16 experiment is planned in order to investigate
the partial restoration of the broken chiral symmetry
at nuclear density. To handle the large number of trigger
channels, 2,620, the trigger merging module, named
TRG-MRG, has been developed. The TRG-MRG consists of one
TABLE II
RESULTS OF THE PERFORMANCE EVALUATION

Item                      Value
Time resolution           <0.35 ns
INL                       [−0.04 LSB, +0.04 LSB]
DNL                       [−0.022 LSB, +0.022 LSB]
Minimum pulse width       100% efficiency at 1.0 ns width
Double pulse separation   100% efficiency at 2.5 ns interval
Latency                   <497 ns
Transfer efficiency       99.95% at the expected condition
main board and two mezzanine cards and will be installed
between the discriminators and the trigger decision module. It
works as a 1 ns TDC and a data multiplexer with four 6.25 Gbps
transceivers. The results of the performance tests, such as
time resolution, latency, and transfer efficiency,
confirm that the TRG-MRG meets the requirements of
the experiment, which will start in JFY 2019.
REFERENCES
[1] Y. Komatsu, et al., JPS Conf. Proc. 13 020005 (2017)
[2] Y. Morino, et al., JPS Conf. Proc. 8 022009 (2015)
[3] Y. Komatsu, et al., Nucl. Instr. and Meth. A 732 (2013) 241
[4] K. Aoki, et al., Nucl. Instr. and Meth. A 628 (2011) 300
[5] APV25-S1 User Guide Version 2.2, [online] available: https://cds.cern.ch/record/1069892/files/cer-002725643.pdf
[6] DRS4 datasheet rev. 0.9, [online] available: https://www.psi.ch/drs/DocumentationEN/DRS4_rev09.pdf
[7] T. N. Takahashi, et al., J. Phys.: Conf. Ser. 664 082053 (2015)
[8] Open-it R&D Project, ADC HRTDC with DRS4, [online] available: http://openit.kek.jp/project/DRS4ADC/public/drs4adc-public
[9] Y. Obara, et al., J. Phys.: Conf. Ser. 664 082043 (2015)
[10] Open-it R&D Project, HBD trigger ASIC, [online] available: http://openit.kek.jp/project/e16_hbd_trigger/e16_hbd_trigger
[11] Belle II Technical Design Report, [online] available: https://arxiv.org/pdf/1011.0352.pdf
[12] M. Nakao, JINST 7 C01028 (2012)
[13] 7 Series FPGAs Data Sheet: Overview, [online] available: https://japan.xilinx.com/support/documentation/data_sheets/ds180_7Series_Overview.pdf
[14] Spartan-3AN FPGA Family Data Sheet, [online] available: https://www.xilinx.com/support/documentation/data_sheets/ds557.pdf
[15] Aurora 8B/10B v11.1 LogiCORE IP Product Guide, [online] available: https://japan.xilinx.com/support/documentation/ip_documentation/aurora_8b10b/v11_1/pg046-aurora-8b10b.pdf
[16] S. Agostinelli, et al., Nucl. Instr. and Meth. A 506 (2003) 250
[17] J. Allison, et al., IEEE Trans. Nucl. Sci. 53 (2006) 270
[18] J. Allison, et al., Nucl. Instr. and Meth. A 835 (2016) 186
