Chapter 9: Neural Network Methods and Application Examples

Questions: what can a von Neumann machine do, and what can it not do?
Good at:
- Fast arithmetic
- Doing precisely what the programmer programs them to do
Not so good at:
- Interacting with noisy data or data from the environment
- Massive parallelism
- Fault tolerance
- Adapting to circumstances

Where can ANN systems or the "brain" help?
- where we can't formulate an algorithmic solution.
- where we can get lots of examples of the behaviour we require (large-sample training).
- where we need to pick out the structure from existing data.
e.g. pattern recognition (recognizing handwritten characters?)
Identification of complex systems that have no mathematical-model description: yes or no?

Tevatron: Fermilab Proton-Antiproton Collider
1) Linear Accelerator
2) Booster
3) Main Injector
4) Antiproton Source
5) Tevatron @ 1.96 TeV
6) CDF + DØ
[Figure: accelerator complex at Batavia, Illinois (Chicago); labels: Main Injector & Recycler, Tevatron, Booster, p, p̄, p̄ source, DØ]

A Schematic Hadron Collider Detector
[Figure: detector cross-section around the p p̄ interaction point; labels:]
- Interaction point
- Tracking system: magnetized volume; bend angle → momentum; innermost tracking layers use silicon
- Calorimeter: induces shower in dense material; EM layers with fine sampling; hadronic layers
- Absorber material
- Muon detector
- e/γ
- μ
- Jet (q or g): experimental signature of a quark or gluon
- "Missing transverse energy": signature of a non-interacting (or weakly interacting) particle like a neutrino or LSP

The DØ Run II Detector
New for Run II:
- Magnetic tracker: SMT, CFT, 2 T solenoid
- Preshower
- Forward muon
- Trigger & DAQ
Retained from Run I:
- U/LAr calorimeter
- Central muon detectors
- Muon toroid

Vertex & Central Tracking
1. Silicon Microstrip Tracker (SMT): ~10 μm (design)
2. Central Fiber Tracker (CFT): scintillating fiber
Pseudorapidity: $\eta = -\ln\tan(\theta/2)$

The Calorimeter
[Figure: coordinate system with axes x, y, z and angle θ]
- Liquid argon sampling
  - Stable, uniform response
  - LAr purity
- Uranium absorber (Cu/Fe for coarse hadronic)
  - Dense absorber, hence can be compact
  - Compensated EM and hadronic response
  - Linear response
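- Hermetic with full coverage
  - $|\eta| < 4.2$ ($\theta \approx 2^\circ$)
Resolution:
- $\sigma/E \sim 15\%/\sqrt{E\,(\mathrm{GeV})}$ for the "fine" EM section
- $\sigma/E \sim 50\%/\sqrt{E\,(\mathrm{GeV})}$ for "coarse" jets
- $\sigma_{\mathrm{MET}} \sim a + b\,S_T + c\,S_T^2$ (Run I), where $S_T$ is the scalar sum of $E_T$; $a \sim 1.89$ GeV, $b \sim 6.7\times10^{-3}$, $c \sim 9.9\times10^{-6}$/GeV

Muon Detector
- Two regions and three layers of scintillators and drift tubes
  - Central and forward
  - A layer inside the toroid magnet
  - B and C layers outside the toroid magnet
- Muon rapidity coverage to ±2
- Shielding reduces backgrounds by 50-100x
- Coarse local $p_T$ resolution
[Figure: J/ψ, local vs. global]

What can we "see"?
1. "Jet" (CAL): you will not see the quark/gluon or the electron/photon itself; cell → "tower" → "jet"; hadronic (0.5) vs. γ/e (0.2)
2. "Track" (CFT + SMT): you only know the particle is stable and charged, but is it e, μ, π, K or p?
3. "Vertex" (SMT): primary vertex / secondary vertex
4. "Muon" (muon system)
5. "MET" (CAL): $\sum E_{\mathrm{cell}} \sin\theta$, without muons

Good at:
- Monte Carlo (MC) questions: what are the signal characteristics of e, μ, τ and quarks?
Not so good at:
- The real problem: pattern recognition. The detector has ~$O(10^6)$ unit/cell ADC readouts, with signal and background entangled.
- Particle identification (ID): no formula, no theory.
ANN: experience and training.

Simulation and reconstruction chain:
- Generator level: Z → ee, μμ, ττ, qq̄ (di-jet)
- GEANT level:
  - decay of unstable particles ($e^{-t/\tau}$)
  - interaction of particles with matter: ionization, bremsstrahlung, strong interaction → energy loss
- Digitization level: energy deposition → detector unit/cell → electronics ADC readout
- Reconstruction level: unit/cell → grouped → cluster → signal shape/pattern, energy + direction
Human brain: learning and empiricism.

Neural networks are a form of multiprocessor computer system, with
- simple processing elements
- a high degree of interconnection
- simple scalar messages
- adaptive interaction between elements

[Figure: x → "black box" → y]
1) Multiple inputs $x_i$ describe the object to be identified.
2) A single output $y$ gives the ID of the object.
3) The "black box":
  - Input integration: $a_j = \sum_k w_{kj} x_k + t_j$, where $w_{kj}$ are the weights of the input channels and $t_j$ is the threshold of the output channel.
  - Output activation function: $y_j = g(a_j)$, e.g. the step function $g(x) = 1$ for $x \ge \theta$, $g(x) = 0$ for $x < \theta$.
4) Extremum principle: $E = \frac{1}{2} \sum_P \sum_i \left( y_i(P) - A_i(P) \right)^2$, where $y_i(P)$ is the network output for pattern $P$ and $A_i(P)$ is the experimentally measured true value.
  Learning:
  - forward: compute $y_i$ and $E$
  - backward: adjust $w_{ij}$ and $t_j$, i.e. back-propagation (BP)
  - repeat until $E$ is minimal

Example: a single neuron
- Inputs $I_0, I_1 \in \{0, 1\}$
- Weights $W_0$, $W_1$ and threshold $W_b$: $a = W_0 I_0 + W_1 I_1 + W_b$
- Activation function: step function, $y = 1$ for $a > 0$ and $y = 0$ for $a \le 0$
- True value (training target): $A = I_0 \ \mathrm{OR}\ I_1$
Where a single neuron does not function well, a more complex network is needed.

Input layer, hidden layer, output layer
[Figure: three-layer network, $x_k$ → $h_j$ → $y_i$]
- Input → hidden: $a_j = \sum_k w_{kj} x_k + t_j$, $h_j = g(a_j)$
- Hidden → output: $\tilde{a}_i = \sum_j \tilde{w}_{ji} h_j + \tilde{t}_i$, $y_i = g(\tilde{a}_i)$
Both steps have the same basic structure and the same activation function.

Back-propagation (BP) network training
0.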
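Extremum principle: minimize the error function $E$ → gradient descent
1. Output layer → hidden layer:
  $\Delta \tilde{w}_{ji} = -\eta \, \partial E / \partial \tilde{w}_{ji}$, where $\eta$ is the learning rate
  $\partial E / \partial \tilde{w}_{ji} = \sum_P (y_i - A_i)\, g'(\tilde{a}_i)\, h_j = \sum_P \delta_i\, g'(\tilde{a}_i)\, h_j$, with $\delta_i = y_i - A_i$
  Update $\tilde{w}_{ji} \mathrel{+}= \Delta \tilde{w}_{ji}$ and iterate until $\Delta \tilde{w}_{ji} \to 0$.
2. Hidden layer → input layer:
  $\partial E / \partial w_{kj} = \sum_P \sum_i \delta_i\, g'(\tilde{a}_i)\, \tilde{w}_{ji}\, g'(a_j)\, x_k = \sum_P \delta_j\, g'(a_j)\, x_k$, with $\delta_j = \sum_i \delta_i\, g'(\tilde{a}_i)\, \tilde{w}_{ji}$
  Update $w_{kj} \mathrel{+}= -\eta \, \partial E / \partial w_{kj}$ and iterate until the increment → 0.
Here $h_j = g(a_j)$ is the hidden-layer output, $\tilde{a}_i$ is the summed input of output unit $i$, and $\tilde{w}_{ji}$ is the weight from hidden unit $j$ to output unit $i$.

Learning rate $\eta$
- If too small, convergence takes a long time.
- If too large, the algorithm can diverge and cause an overflow error in the computer's floating-point arithmetic.
- Typical range: [0.001, 0.5]; start large, then decrease.

Summary
- True value $A_i$: true or false
- Input → hidden layer → output
- Inputs and weights: $a_i = \sum_j w_{ij} x_j$
- Activation function and output $y_i = g(a_i)$: forces a yes/no choice and amplifies it
- Compare with the true value → back-propagate → adjust by gradient descent on the error → fix $w_{ij}$
Activation functions:
- Sigmoid: $g(x) = \dfrac{1}{1 + e^{-x/T}}$
- Step: $g(x) = 1$ for $x \ge \theta$, $g(x) = 0$ for $x < \theta$
- Hyperbolic tangent: $g(x) = \tanh(Tx)$

τ identification in high-energy hadron collisions

Elementary particles and their detection
- "Matter": 3 generations of quarks, 3 generations of leptons
- "Force carriers": 3 types of interactions
  - Electromagnetic: γ
  - Weak: W/Z
  - Strong: g
- Last missing piece: m = f/a? Higgs? Vacuum? Spontaneous breaking of the vacuum symmetry.

Hadron spectrum (1):
- gluon: m = 0; g → qq̄ / gg
- up, down quarks: m ~ 1 MeV; π mesons, η, ρ, φ, ω, etc.
  - π±: cτ = 8 m
  - π⁰ → 2γ, Br ~ 100%
- strange quark: m ~ 100 MeV; K mesons
  - K±: cτ = 4 m
  - K⁰-K̄⁰ mixing → K_L-K_S, CP violation
- charm quark: m ~ 1 GeV; D mesons, unstable
  - cc̄ → J/ψ(1S) → μμ, Br ~ 6%
These form the primary "jet": cone 0.5.

Hadron spectrum (2):
- bottom quark: m ~ 5 GeV; B mesons, b-tagging
  - cτ = 500 μm; γ = E/m ≳ 10 (Lorentz time dilation)
  - PV (primary vertex) ~ 25 μm
  - SV (secondary vertex) ~ 5000 μm
  - multiple tracks in cone 0.5-0.7
  - secondary-vertex selection and the charge-center method (±1/3), etc.: ~50% efficiency
- top quark: m ~ 180 GeV, Γ ~ 3 GeV; extremely short lifetime, does not hadronize
  - t → b + W (semileptonic decay lν, l = e, μ): ~100% (10%)

Lepton spectrum (1):
- neutrinos ν_e, ν_μ, ν_τ: m ~ 0, weak interaction → missing transverse energy
- electron e: m ~ 0.511 MeV, stable; strong bremsstrahlung in high-Z material →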
compact in the calorimeter, i.e. loses most of its energy in cone 0.2; cut on isolation, EM fraction and shower shape (H-matrix)
- μ: m ~ 106 MeV, cτ = 658 m; track + MIP in the calorimeter + penetrates to the outermost muon chamber

Lepton spectrum (2):
- τ: m ~ 1.776 GeV, cτ = 90 μm
  - τ⁻ → l⁻ ν̄_l ν (l = e, μ): ~17.5%
  - τ⁻ → h⁻ ν + neutrals: ~50% → π, "1-prong"
  - τ⁻ → h⁻ h⁺ h⁻ ν + neutrals: ~15% → ρ, "3-prong"
1-prong problem:
- 1 track, so no secondary-vertex reconstruction
- 50% of decays look like a hadron jet, a little more compact and isolated than an initial quark/gluon jet
... but how to describe it: cone 0.4, 0.5 or 0.7?
Identifying hadronic τ decays at a hadron collider is a challenge!

Di-tau study: light Higgs decay
[Figure: branching ratios of the SM Higgs]
- Yukawa couplings: $\Gamma \sim m^2 / M_W^2 = m^2 / (80\ \mathrm{GeV})^2$
- bb channel: 90% Br, O(1 pb) vs. background O(1 mb), S/B ~ 1:10⁹
- ττ channel: 10% Br, O(0.1 pb) vs. background O(100 pb), S/B ~ 1:10³

ANN τ identification in high-energy hadron collisions
Input variables:
- profile = E_T of Tower(1+2) / E_T(total) (i.e. broader than an electron)
- trkiso = p_T of tracks in cone 0.7, excluding the τ 1-prong track
- Et/pt, ringiso, e1e2, dalpha, EM12fr, etc.
Comparison with the true value:
- MC: signal τ → π±ν, π±π⁰ν vs. q/g/b jet background
- data: signal Z → ττ → e/μ + h, with cuts E_T(l) > 12, M(lh) < 60, Δφ > 2.5, unlike-sign (l·h); background: all the same selections, except like-sign instead of unlike-sign

Tuning the w-est variable:
- σ(Z → ττ, π-type) = 235 ± 137 pb
- σ(Z → ττ, ρ-type) = 222 ± 71 pb
- σ(Z → μμ) = 261.8 ± 5.0 ± 8.9 ± 26.2 pb
- No way from MC → fast MC ID, but only as a substitute
- ANN gives the best result for Z → ττ π-type as calibration, with efficiency ~ 80%

A way to use lifetime for τ ID?
- cτ[τ] ~ 80 μm
- Distance of closest approach (DCA) ~ 40, σ ~ 25
- The goal: Higgs → ττ
[Figure legend: Z → μμ data; Z → μμ MC; Z → ττ 1-prong MC]
- The doubled lifetime information in sumDCA of the di-tau system might help.
- sumDCA >> resolution, so it should discriminate against a di-track system with no lifetime.
- ANN is the only choice to model, "exaggerate" and tune the cut.
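
Since the ANN here is used as a trained "black box", it may help to see the training loop from earlier in the chapter (forward pass, error E, back-propagated gradient-descent updates with learning rate η) written out once. The following is a minimal sketch, not the τ-ID network itself. It assumes pure Python, a sigmoid activation with T = 1, three hidden units, a single output, a fixed random seed, and the toy target A = I0 OR I1 from the single-neuron example. All function and variable names are hypothetical.

```python
import math
import random


def g(a):
    """Sigmoid activation g(a) = 1 / (1 + exp(-a)), i.e. T = 1 (assumption)."""
    return 1.0 / (1.0 + math.exp(-a))


def g_prime(y):
    """Derivative g'(a) of the sigmoid, written in terms of its output y = g(a)."""
    return y * (1.0 - y)


def forward(x, w, t, w_out, t_out):
    """Forward pass: input -> hidden (h_j) -> single output y."""
    h = [g(sum(w[k][j] * x[k] for k in range(len(x))) + t[j])
         for j in range(len(t))]
    y = g(sum(w_out[j] * h[j] for j in range(len(h))) + t_out)
    return h, y


def train(patterns, n_hidden=3, eta=0.5, epochs=5000, seed=0):
    """Batch gradient descent on E = 1/2 * sum_P (y - A)^2."""
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    t = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    t_out = rng.uniform(-0.5, 0.5)
    errors = []  # E evaluated before each epoch's update
    for _ in range(epochs):
        grad_w = [[0.0] * n_hidden for _ in range(n_in)]
        grad_t = [0.0] * n_hidden
        grad_w_out = [0.0] * n_hidden
        grad_t_out = 0.0
        error = 0.0
        for x, target in patterns:
            h, y = forward(x, w, t, w_out, t_out)
            error += 0.5 * (y - target) ** 2
            # Output layer: (y - A) * g'(a_out)
            d_out = (y - target) * g_prime(y)
            grad_t_out += d_out
            for j in range(n_hidden):
                grad_w_out[j] += d_out * h[j]
                # Hidden layer: output term propagated back through w_out[j]
                d_hid = d_out * w_out[j] * g_prime(h[j])
                grad_t[j] += d_hid
                for k in range(n_in):
                    grad_w[k][j] += d_hid * x[k]
        errors.append(error)
        # Gradient-descent step: parameter += -eta * dE/dparameter
        t_out -= eta * grad_t_out
        for j in range(n_hidden):
            w_out[j] -= eta * grad_w_out[j]
            t[j] -= eta * grad_t[j]
            for k in range(n_in):
                w[k][j] -= eta * grad_w[k][j]
    return (w, t, w_out, t_out), errors


# Toy training sample: target A = I0 OR I1
OR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
params, errors = train(OR_PATTERNS)
for x, target in OR_PATTERNS:
    _, y = forward(x, *params)
    print(x, target, round(y, 3))
```

For a real application the toy inputs would be replaced by the chosen discriminating variables and the target by the signal/background label. The learning rate would then usually be tuned within the [0.001, 0.5] range mentioned earlier.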