JRE has JVM, Garbage collector, JIT compiler
JVM - young generation, old generation, permanent generation (replaced by Metaspace in Java 8).
Garbage collector - Serial, Parallel (multithreaded), CMS, G1
- single threaded stop the world young generation collector
- single threaded stop the world old generation collector
- Multithreaded stop the world young generation collector
- Single/multi threaded stop the world old generation collector
- Server class machine default collector
- Multithreaded young & old generation collector
- 2 GB
- 2 virtual CPUs
- One exception: on 32-bit Windows the default is the serial collector.
- Supported as of Java HotSpot 7 update 4
- Region based multithreaded stop the world young generation collector
- Combination of a mostly concurrent and stop the world old generation collector
- Takes Java bytecode and generates native code for underlying platform
- Huge performance improvement realized from JIT compilation
- Client - rapid startup
- Server - highly
- Tiered - best of both, enabled via -XX:+TieredCompilation (default for Java 8)
Eden - new objects are allocated here; when it is full, a minor GC occurs and surviving objects are copied to the "To Survivor" space.
From Survivor - surviving objects are copied to the "To Survivor" space during a minor GC; the two survivor spaces then swap roles
To Survivor - objects that survive several minor GCs are promoted to the "Old Generation" space
Old Generation - for older, longer-living objects; collected during a full GC, which takes longer than a minor GC
Permanent Generation (not heap)
jvisualvm + Visual GC + Memory Pools
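The memory pools that Visual GC displays can also be listed from inside the JVM. The following is a minimal sketch using the standard java.lang.management API; the class name MemoryPoolDump is made up for illustration, and the pool names printed depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of any library.
class MemoryPoolDump {
    // Formats one line per pool: name, type (heap or non-heap), used and max size in MB.
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage();
        if (usage == null) {
            // getUsage() returns null when the pool is no longer valid
            return pool.getName() + " (not valid)";
        }
        long usedMb = usage.getUsed() / (1024 * 1024);
        // getMax() returns -1 when the pool has no defined maximum
        String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / (1024 * 1024)) + "M";
        return String.format("%-30s %-16s used=%dM max=%s",
                pool.getName(), pool.getType(), usedMb, max);
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```

With the parallel collector the pool names typically include the eden, survivor and old gen spaces; other collectors use different names.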
Operating Systems Performance Metrics to Monitor
CPU usage, including user CPU, system CPU and idle time
Virtual memory usage
Process behavior, especially context switching, CPU scheduling and thread migration
Disk I/O, if Disk I/O is involved with the application
Network I/O, if network traffic is involved with the application
Tools: vmstat 5, mpstat, vm_stat, top, monitor
An app that spends more than 5% of its available clock cycles in voluntary context switching is likely suffering from lock contention.
A voluntary context switch can cost as much as 80K clock cycles
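The 5% rule can be checked with simple arithmetic. This sketch assumes the worst-case cost of 80K cycles per context switch noted above; the class name and the sample rate of 3,500 switches per second are hypothetical, while the 2 vCPUs at 2.7 GHz match the sample appliance at the end of these notes.

```java
// Hypothetical helper class for the arithmetic only.
class ContextSwitchCost {
    // Upper-bound cost of one context switch, taken from the notes above.
    static final long CYCLES_PER_SWITCH = 80_000L;

    // Percent of the available clock cycles consumed by context switches.
    static double percentOfCycles(long switchesPerSecond, long clockHz, int vcpus) {
        double spent = (double) switchesPerSecond * CYCLES_PER_SWITCH;
        double available = (double) clockHz * vcpus;
        return 100.0 * spent / available;
    }

    public static void main(String[] args) {
        // Assumed sample: 3,500 switches/s (e.g. read from a tool such as pidstat -w)
        double pct = percentOfCycles(3_500, 2_700_000_000L, 2);
        System.out.printf("context switching uses %.1f%% of available cycles%n", pct);
        System.out.println(pct > 5.0 ? "likely lock contention" : "below the 5% guideline");
    }
}
```

Because 80K cycles is an upper bound, the result is a pessimistic estimate rather than a measurement.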
CPU Scheduler Run Queue
Application threads that are ready to run wait on the run queue; if there are more runnable threads than vCPUs, the run queue builds up. A run queue depth of about double the number of vCPUs will degrade performance.
How to correct:
Add additional processors
Add additional systems
Reduce the load
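A rough in-process version of the run queue check can be sketched as follows. Note the assumptions: the system load average is only a proxy for run queue depth (the "r" column of vmstat is the direct measure), the threshold of double the vCPU count comes from the notes above, and the class name RunQueueCheck is made up.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper class; load average is used as a stand-in for run queue depth.
class RunQueueCheck {
    // True when the run queue depth is at least double the number of vCPUs.
    static boolean isSaturated(double runQueueDepth, int vcpus) {
        return runQueueDepth >= 2.0 * vcpus;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        int vcpus = os.getAvailableProcessors();
        // getSystemLoadAverage() returns a negative value when unavailable (e.g. on Windows)
        double load = os.getSystemLoadAverage();
        if (load < 0) {
            System.out.println("load average not available on this platform");
        } else {
            System.out.printf("vCPUs=%d load=%.2f saturated=%b%n",
                    vcpus, load, isSaturated(load, vcpus));
        }
    }
}
```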
JVM Performance metrics to monitor
Garbage collection is by far the most intrusive activity upon the application.
-XX:+PrintGCDateStamps or -XX:+PrintGCTimeStamps
-Xloggc:<file> if you'd like to direct output to a log file
VisualVM's VisualGC to observe garbage collection activity.
Monitoring application execution time and stopped time
Identify when the JVM stopped all application threads for GC
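Besides GC logs, cumulative collection counts and times are available through the standard GarbageCollectorMXBean interface. The sketch below is one way to read them; the class name GcActivity and the overhead calculation are illustrative. For mostly concurrent collectors the reported time is not purely stopped time, so treat the percentage as an approximation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class built on the standard management API.
class GcActivity {
    // Sums accumulated collection time (ms) over all collectors; -1 means "undefined" and is skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // GC time as a percentage of JVM uptime.
    static double gcOverheadPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead: %.2f%%%n", gcOverheadPercent(totalGcMillis(), uptime));
    }
}
```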
Others to consider but not necessary
Fine tuning JVM heap space sizes
-XX:+PrintAdaptiveSizePolicy (parallel GC or G1 only)
jstatd acts as an agent
requires a security manager and security policy file
jstatd should run with the same user credentials as the app
App metrics to monitor
app level jmx mbeans
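An application-level MBean can be as small as one counter. The following is a minimal sketch of a standard MBean registered with the platform MBean server; the names AppMetrics, RequestStats and the object name com.example.app:type=RequestStats are all made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical application metrics holder.
class AppMetrics {
    // Standard MBean convention: the interface is public and named <ClassName>MBean.
    public interface RequestStatsMBean {
        long getRequestCount();
    }

    public static class RequestStats implements RequestStatsMBean {
        private final AtomicLong count = new AtomicLong();

        // Called by application code for each handled request.
        public void recordRequest() {
            count.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return count.get();
        }
    }

    // Registers a new RequestStats instance under the given object name.
    static RequestStats register(String objectName) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        RequestStats stats = new RequestStats();
        server.registerMBean(stats, new ObjectName(objectName));
        return stats;
    }

    public static void main(String[] args) throws Exception {
        String name = "com.example.app:type=RequestStats"; // assumed name
        RequestStats stats = register(name);
        stats.recordRequest();
        Object value = ManagementFactory.getPlatformMBeanServer()
                .getAttribute(new ObjectName(name), "RequestCount");
        System.out.println("RequestCount = " + value);
    }
}
```

Once registered, the attribute shows up in the MBeans tab of JConsole or VisualVM next to the JVM's own beans.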
Object allocation occurs in Eden; once it becomes full a minor GC occurs, and surviving objects are copied to a survivor space.
Once objects have survived for some period of time they are promoted to the old generation.
Once the old generation fills up, a full GC is required.
Both young and old generation GCs are stop-the-world events.
Default GC for server class machines (2 vCPUs and 2 GB)
-XX:+UseParallelGC - default GC in Java 6 through Java 7 Update 3.
* Is a multi-threaded young generation GC and single threaded old generation GC.
* Full GC takes longer than minor GC.
-XX:+UseParallelOldGC - default in Java 7 Update 4 and later.
* Is a multi-threaded young generation GC and multi threaded old generation GC.
Note: PermGen is also collected on full GC.
** Reduce full GC collections as much as possible.
young to old generation.
End of minor GC Eden is empty
When Old Gen crosses some threshold, a major GC occurs
Multi-threaded Young and concurrent Old Gen.
Some phases are single threaded, stop the world.
Initial-mark phase, stop the world and single threaded.
Concurrent marking phase, concurrent and multi-threaded.
Pre-cleaning phase - concurrent and multi-threaded.
Remark phase - stop the world and multi-threaded.
Concurrent sweeping phase, concurrent and multi-threaded.
Concurrent collection does not compact old gen.
A stop-the-world, single-threaded old gen compaction (a full GC) occurs if the concurrent cycle is not keeping up, or if old gen is too fragmented for a promoted object to fit.
-XX:+UseConcMarkSweepGC - explicitly use this.
multi-threaded minor GC; concurrent old gen GC at about 70% old gen occupancy.
- Rather than having physical GC spaces such as eden, survivor and old gen, G1 divides one large contiguous space into many fixed-size regions.
- Regions are designated as eden, survivor or old regions.
- Two additional types of regions: available/unused regions and humongous regions for large objects.
- concurrent old gen GC; the stop-the-world phases, such as 'remark', are very quick.
- initial mark phase
- root region scanning
- conc marking phase
- remark phase stop the world
- cleanup phase, stop the world
-Xms -Xmx min and max heap size.
-XX:MaxGCPauseMillis defaults to 200 ms - a goal, not a guarantee
-Xmn, -XX:NewSize/-XX:MaxNewSize, -XX:SurvivorRatio
How to measure? Transactions or messages per second?
What is the expected throughput?
How long do you need to sustain that expected throughput?
What is considered unacceptable throughput?
Round trip time? HTTP request/response
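The questions above boil down to two numbers: operations per second and time per operation. A minimal measurement sketch follows; handleRequest is a hypothetical stand-in for a real transaction, and a serious benchmark would also need warm-up runs so that JIT compilation does not distort the result.

```java
// Hypothetical measurement harness; handleRequest() stands in for real work.
class ThroughputProbe {
    static long sink; // keeps the JIT from discarding the work

    // Completed operations per second.
    static double perSecond(long completed, long elapsedNanos) {
        return completed * 1_000_000_000.0 / elapsedNanos;
    }

    // Average round trip time per operation in milliseconds.
    static double avgRoundTripMillis(long completed, long elapsedNanos) {
        return elapsedNanos / 1_000_000.0 / completed;
    }

    // Placeholder for one transaction, message or HTTP request.
    static void handleRequest(int i) {
        sink += Integer.toString(i).hashCode();
    }

    public static void main(String[] args) {
        int requests = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            handleRequest(i);
        }
        long elapsed = System.nanoTime() - start;
        System.out.printf("throughput: %.0f req/s, avg round trip: %.6f ms%n",
                perSecond(requests, elapsed), avgRoundTripMillis(requests, elapsed));
    }
}
```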
Footprint (how much memory can use)
How much memory can be used by java app?
how much ram available?
Determine memory footprint
- Determine Live data size? (heap)
- Run with -XX:+UseParallelOldGC and collect GC statistics with -XX:+PrintGCDetails and -XX:+PrintGCDateStamps
If you know the max RAM and the number of JVMs running on the system,
then size the heap to fit, e.g. -Xms3g -Xmx3g leaving 1g for the OS; use about 80% of RAM.
coz of young, old or perm
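The 80% rule of thumb above can be turned into a small calculation. This sketch assumes the usable RAM is split evenly across the JVMs on the host; the class name HeapBudget and the 4096 MB host in the example are hypothetical.

```java
// Hypothetical helper for the footprint arithmetic.
class HeapBudget {
    // Heap size per JVM in MB when a fraction of RAM is shared evenly by all JVMs.
    static long heapPerJvmMb(long ramMb, int jvmCount, double usableFraction) {
        long usableMb = (long) (ramMb * usableFraction);
        return usableMb / jvmCount;
    }

    public static void main(String[] args) {
        // Assumed host: 4096 MB RAM, one JVM, 80% of RAM usable
        long heapMb = heapPerJvmMb(4096, 1, 0.8);
        System.out.printf("-Xms%dm -Xmx%dm%n", heapMb, heapMb);
    }
}
```

The budget covers the Java heap only; perm gen, thread stacks and native memory come on top, so the real process footprint is larger.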
Live data size?
Use VisualVM or JConsole to trigger a full GC.
* Occupancy of Old gen space after full GC
ParOldGen: 420M->290M(640M) -- Live data size
PSPermGen: 32M->32M(64M) -- Live data size
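The live data size can be pulled out of such log lines mechanically. The sketch below parses the "before->after(capacity)" pattern shown above and returns the occupancy after GC; the class name LiveDataSize is made up, and real -XX:+PrintGCDetails output usually reports sizes in K, which the unit handling allows for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "Space: before->after(capacity)" entries in a full GC log line.
class LiveDataSize {
    private static final Pattern SPACE = Pattern.compile(
            "(\\w+): (\\d+)([KMG])->(\\d+)([KMG])\\((\\d+)([KMG])\\)");

    // Converts a value with unit K, M or G to kilobytes.
    static long toKb(long value, char unit) {
        switch (unit) {
            case 'G': return value * 1024 * 1024;
            case 'M': return value * 1024;
            default:  return value;
        }
    }

    // Occupancy after GC (in KB) for the named space, or -1 if the space is not in the line.
    static long afterGcKb(String line, String space) {
        Matcher m = SPACE.matcher(line);
        while (m.find()) {
            if (m.group(1).equals(space)) {
                return toKb(Long.parseLong(m.group(4)), m.group(5).charAt(0));
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String line = "ParOldGen: 420M->290M(640M) PSPermGen: 32M->32M(64M)";
        System.out.println("old gen live data:  " + afterGcKb(line, "ParOldGen") / 1024 + "M");
        System.out.println("perm gen live data: " + afterGcKb(line, "PSPermGen") / 1024 + "M");
    }
}
```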
Worst case latency - seconds to run gc
Determine Java Heap Size
-Xms and -Xmx to 3x to 4x the live data size.
-Xmn young gen to 1x to 1.5x the live data size.
old gen to 2x to 3x the live data size.
In other words, young gen should be about 1/3rd or 1/4th of -Xms or -Xmx
Live data size : 512m
-Xms2g -Xmx2g -Xmn768m
-XX:PermSize & -XX:MaxPermSize to about 1.2x to 1.5x the live perm gen size (perm gen occupancy after full GC).
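The sizing rules above can be applied mechanically once the live data sizes are known. This sketch uses the upper end of each range (4x for the heap, 1.5x for young gen and perm gen), which reproduces the 512m example; the class name HeapSizing is hypothetical, and the perm gen flags apply only to JVMs before Java 8.

```java
// Hypothetical helper applying the rule-of-thumb multipliers from the notes.
class HeapSizing {
    // Builds heap flags from live data sizes (MB): heap = 4x, young gen = 1.5x, perm gen = 1.5x.
    static String flagsFor(long liveHeapMb, long livePermMb) {
        long heapMb = 4 * liveHeapMb;
        long youngMb = liveHeapMb * 3 / 2;
        long permMb = livePermMb * 3 / 2;
        return String.format("-Xms%dm -Xmx%dm -Xmn%dm -XX:PermSize=%dm -XX:MaxPermSize=%dm",
                heapMb, heapMb, youngMb, permMb, permMb);
    }

    public static void main(String[] args) {
        // Live data sizes as in the notes: 512 MB heap; 32 MB perm gen from the sample log line
        System.out.println(flagsFor(512, 32));
    }
}
```

Old gen is implied rather than set: it is whatever remains of the heap after -Xmn, here 2048m - 768m = 1280m, or 2.5x the live data size.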
----- Sample Appliance ----
-Xms1024m -Xmx1024m -XX:PermSize=256m -XX:MaxPermSize=256m -Dcom.sun.management.jmxremote -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/appdirectorLongevityError.dump
vCPU - 2
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
stepping : 7
cpu MHz : 2700.000
cache size : 20480 KB
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat epb pln pts dts
bogomips : 5400.00
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual