Understanding the Java Memory Model in Depth

Posted on 2018-10-21 | In java

Understanding the Java Memory Model in Depth, Part 1: Basics

Doug Lea's notes on the JSR-133 memory model: The JSR-133 Cookbook for Compiler Writers.

Classification of concurrent programming models

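Concurrent programming has to address two key questions: how threads communicate with each other, and how threads synchronize with each other (here "thread" means any concurrently executing active entity).

Communication is the mechanism by which threads exchange information. In imperative programming there are two such mechanisms: shared memory and message passing.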

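In the shared-memory concurrency model, threads share the program's common state and communicate implicitly by writing and reading that common state in memory. In the message-passing concurrency model, threads have no common state and must communicate explicitly by sending messages.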
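The difference between the two styles can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, and the "message passing" half uses java.util.concurrent.BlockingQueue as a mailbox, which in Java is itself built on top of shared memory.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; the names are illustrative only.
final class CommunicationStyles {
    // Common state that both threads can see.
    private static volatile int sharedState = 0;

    // Implicit communication: the writer stores into the shared field and
    // the caller later loads it. Nothing is "sent" explicitly.
    static int viaSharedMemory() {
        sharedState = 0;
        Thread writer = new Thread(() -> sharedState = 42);
        writer.start();
        join(writer);
        return sharedState;
    }

    // Explicit communication: the producer hands the value over through a
    // mailbox, and the caller blocks until a message arrives.
    static int viaMessagePassing() {
        BlockingQueue<Integer> mailbox = new ArrayBlockingQueue<>(1);
        Thread producer = new Thread(() -> mailbox.add(42));
        producer.start();
        try {
            return mailbox.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaSharedMemory());
        System.out.println(viaMessagePassing());
    }
}
```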

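Synchronization is the mechanism a program uses to control the relative order in which operations in different threads occur. In the shared-memory concurrency model, synchronization is explicit: the programmer must explicitly specify that a method or a block of code has to execute with mutual exclusion between threads. In the message-passing concurrency model, a message must be sent before it can be received, so synchronization is implicit.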
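As a minimal sketch of explicit synchronization in the shared-memory model, the following uses a synchronized block to make increments of a shared counter mutually exclusive. The class name, lock object, and method are hypothetical and exist only for this illustration.

```java
// Hypothetical demo class; the names are illustrative only.
final class ExplicitSync {
    private static final Object LOCK = new Object();
    private static int counter;

    // Starts the given number of threads; each one increments the shared
    // counter inside a synchronized block, so no update is lost.
    static int countWithLock(int threads, int incrementsPerThread) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    synchronized (LOCK) {   // explicit mutual exclusion
                        counter++;
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(4, 10_000));
    }
}
```

Without the synchronized block, counter++ is a read-modify-write sequence that two threads can interleave, so the final count could come out lower than expected.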

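Java concurrency uses the shared-memory model. Communication between Java threads is always implicit, and the whole process is completely transparent to the programmer. A Java programmer who writes multithreaded code without understanding how this implicit inter-thread communication works is likely to run into all kinds of strange memory visibility problems.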
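A typical visibility problem involves a stop flag that one thread sets and another thread polls. The sketch below shows the safe variant, with the flag declared volatile; the class and method names are hypothetical. If the volatile modifier were removed, the memory model would no longer guarantee that the worker ever observes the update, and whether the loop actually hangs would depend on the JVM and the JIT compiler.

```java
// Hypothetical demo class; the names are illustrative only.
final class VisibilityDemo {
    // volatile guarantees that the write below becomes visible to the worker.
    private static volatile boolean stopRequested;

    // Returns true if the worker observed the flag and terminated in time.
    static boolean workerSeesStop(long timeoutMillis) {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // busy-wait until the flag written by the other thread is seen
            }
        });
        worker.start();
        stopRequested = true;
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerSeesStop(5_000));
    }
}
```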

The abstraction of the Java memory model

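In Java, all instance fields, static fields, and array elements are stored in heap memory, and the heap is shared between threads (this article uses the term "shared variable" to refer to instance fields, static fields, and array elements).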

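Local variables, method parameters (which the Java Language Specification calls formal method parameters), and exception handler parameters are not shared between threads. They have no memory visibility problems and are not affected by the memory model.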
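To make the distinction concrete, the sketch below labels each kind of variable. All names are hypothetical. Two threads run the same method at the same time, yet each invocation has its own parameter and local variables, so the results do not interfere.

```java
// Hypothetical demo class; the names are illustrative only.
final class SharedVsLocal {
    static int staticField;               // static field: shared variable
    int instanceField;                    // instance field: shared variable
    final int[] elements = new int[1];    // array elements: shared variables

    // 'limit' is a formal method parameter; 'sum' and 'i' are local
    // variables. None of them is shared between threads.
    static int sumTo(int limit) {
        int sum = 0;
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs sumTo concurrently in two threads. The array elements used to
    // return the results are shared variables; each thread writes its own
    // slot, and join() makes the writes visible to the caller.
    static int[] runInTwoThreads() {
        int[] results = new int[2];
        Thread a = new Thread(() -> results[0] = sumTo(10));
        Thread b = new Thread(() -> results[1] = sumTo(100));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {   // 'e' is an exception handler parameter
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        int[] results = runInTwoThreads();
        System.out.println(results[0] + " " + results[1]);
    }
}
```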

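Communication between Java threads is controlled by the Java memory model (abbreviated as JMM in this article). The JMM determines when a write to a shared variable by one thread becomes visible to another thread. From an abstract point of view, the JMM defines the relationship between threads and main memory:

Shared variables are stored in main memory. Each thread has a private local memory, which holds that thread's copies of the shared variables it reads and writes.
Local memory is an abstraction of the JMM and does not physically exist. It covers caches, write buffers, registers, and other hardware and compiler optimizations.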

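The abstract view of the Java memory model is shown in the following figure:

[Figure: abstract diagram of the Java memory model]

As the figure shows, for thread A to communicate with thread B, the following two steps must take place:

First, thread A flushes the shared variables it has updated in local memory A to main memory.
Then, thread B reads from main memory the shared variables that thread A has updated.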

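The following figure illustrates these two steps:

[Figure: thread A flushes x to main memory, then thread B reads x from main memory]

As the figure shows, local memories A and B each hold a copy of the shared variable x from main memory. Assume that x is initially 0 in all three memories. While running, thread A temporarily keeps the updated value of x (say 1) in its own local memory A. When thread A and thread B need to communicate, thread A first flushes the modified x from its local memory to main memory, so that x in main memory becomes 1. Thread B then reads the updated x from main memory, and x in thread B's local memory becomes 1 as well.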

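Taken as a whole, these two steps amount to thread A sending a message to thread B, and this communication must go through main memory. By controlling the interaction between main memory and each thread's local memory, the JMM provides memory visibility guarantees to Java programmers.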
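The two steps can be sketched in Java as follows. The text above does not say which language construct triggers the flush and the re-read. This sketch assumes a volatile flag for that purpose, which is one of several options (locks and Thread.join() are others). The class and field names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
final class TwoStepCommunication {
    private static int x;                    // plain shared variable
    private static volatile boolean ready;   // assumed signalling mechanism

    // Returns the value of x that thread B observes.
    static int valueSeenByB() {
        x = 0;
        ready = false;
        int[] seen = new int[1];

        Thread b = new Thread(() -> {
            while (!ready) {
                // wait until thread A signals that x has been published
            }
            seen[0] = x;   // step 2: B reads the value A published
        });
        Thread a = new Thread(() -> {
            x = 1;         // A updates x
            ready = true;  // step 1: the volatile write publishes x as well
        });

        b.start();
        a.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByB());
    }
}
```

Because the write to x comes before the volatile write to ready, and thread B reads x only after it has seen ready as true, the memory model guarantees that B observes x as 1.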

hotspot source code - memory types and generation gc

Posted on 2018-08-05 | In java

vm/memory/allocation.hpp, line 135, defines the types of memory used by the JVM. The source is as follows:

/*
 * Memory types
 */
enum MemoryType {
  // Memory type by sub systems. It occupies lower byte.
  mtJavaHeap          = 0x00,  // Java heap
  mtClass             = 0x01,  // memory class for Java classes
  mtThread            = 0x02,  // memory for thread objects
  mtThreadStack       = 0x03,
  mtCode              = 0x04,  // memory for generated code
  mtGC                = 0x05,  // memory for GC
  mtCompiler          = 0x06,  // memory for compiler
  mtInternal          = 0x07,  // memory used by VM, but does not belong to
                                 // any of above categories, and not used for
                                 // native memory tracking
  mtOther             = 0x08,  // memory not used by VM
  mtSymbol            = 0x09,  // symbol
  mtNMT               = 0x0A,  // memory used by native memory tracking
  mtClassShared       = 0x0B,  // class data sharing
  mtChunk             = 0x0C,  // chunk that holds content of arenas
  mtTest              = 0x0D,  // Test type for verifying NMT
  mtTracing           = 0x0E,  // memory used for Tracing
  mtNone              = 0x0F,  // undefined
  mt_number_of_types  = 0x10   // number of memory types (mtDontTrack
                                 // is not included as validate type)
};

hotspot source code - jvm stack frame

Posted on 2018-08-05 | In java

vm/runtime/frame.hpp, line 54, begins the definition of the methods that operate on stack frames. The source is as follows:

class CodeBlob;
class FrameValues;
class vframeArray;

// A frame represents a physical stack frame (an activation).  Frames
// can be C or Java frames, and the Java frames can be interpreted or
// compiled.  In contrast, vframes represent source-level activations,
// so that one physical frame can correspond to multiple source level
// frames because of inlining.

class frame VALUE_OBJ_CLASS_SPEC {
 private:
  // Instance variables:
  intptr_t* _sp; // stack pointer (from Thread::last_Java_sp)
  address   _pc; // program counter (the next instruction after the call)

  CodeBlob* _cb; // CodeBlob that "owns" pc
  enum deopt_state {
    not_deoptimized,
    is_deoptimized,
    unknown
  };

  deopt_state _deopt_state;

 public:
  // Constructors
  frame();

#ifndef PRODUCT
  // This is a generic constructor which is only used by pns() in debug.cpp.
  // pns (i.e. print native stack) uses this constructor to create a starting
  // frame for stack walking. The implementation of this constructor is platform
  // dependent (i.e. SPARC doesn't need an 'fp' argument an will ignore it) but
  // we want to keep the signature generic because pns() is shared code.
  frame(void* sp, void* fp, void* pc);
#endif

  // Accessors

  // pc: Returns the pc at which this frame will continue normally.
  // It must point at the beginning of the next instruction to execute.
  address pc() const             { return _pc; }

  // This returns the pc that if you were in the debugger you'd see. Not
  // the idealized value in the frame object. This undoes the magic conversion
  // that happens for deoptimized frames. In addition it makes the value the
  // hardware would want to see in the native frame. The only user (at this point)
  // is deoptimization. It likely no one else should ever use it.
  address raw_pc() const;

  void set_pc( address   newpc );

  intptr_t* sp() const           { return _sp; }
  void set_sp( intptr_t* newsp ) { _sp = newsp; }

  CodeBlob* cb() const           { return _cb; }

  // patching operations
  void   patch_pc(Thread* thread, address pc);

  // Every frame needs to return a unique id which distinguishes it from all other frames.
  // For sparc and ia32 use sp. ia64 can have memory frames that are empty so multiple frames
  // will have identical sp values. For ia64 the bsp (fp) value will serve. No real frame
  // should have an id() of NULL so it is a distinguishing value for an unmatchable frame.
  // We also have relationals which allow comparing a frame to anoth frame's id() allow
  // us to distinguish younger (more recent activation) from older (less recent activations)
  // A NULL id is only valid when comparing for equality.

  intptr_t* id(void) const;
  bool is_younger(intptr_t* id) const;
  bool is_older(intptr_t* id) const;

  // testers

  // Compares for strict equality. Rarely used or needed.
  // It can return a different result than f1.id() == f2.id()
  bool equal(frame other) const;

  // type testers
  bool is_interpreted_frame()    const;
  bool is_java_frame()           const;
  bool is_entry_frame()          const;             // Java frame called from C?
  bool is_stub_frame()           const;
  bool is_ignored_frame()        const;
  bool is_native_frame()         const;
  bool is_runtime_frame()        const;
  bool is_compiled_frame()       const;
  bool is_safepoint_blob_frame() const;
  bool is_deoptimized_frame()    const;

  // testers
  bool is_first_frame() const; // oldest frame? (has no sender)
  bool is_first_java_frame() const;              // same for Java frame

  bool is_interpreted_frame_valid(JavaThread* thread) const;       // performs sanity checks on interpreted frames.

  // tells whether this frame is marked for deoptimization
  bool should_be_deoptimized() const;

  // tells whether this frame can be deoptimized
  bool can_be_deoptimized() const;

  // returns the frame size in stack slots
  int frame_size(RegisterMap* map) const;

  // returns the sending frame
  frame sender(RegisterMap* map) const;

  // for Profiling - acting on another frame. walks sender frames
  // if valid.
  frame profile_find_Java_sender_frame(JavaThread *thread);
  bool safe_for_sender(JavaThread *thread);

  // returns the sender, but skips conversion frames
  frame real_sender(RegisterMap* map) const;

  // returns the the sending Java frame, skipping any intermediate C frames
  // NB: receiver must not be first frame
  frame java_sender() const;

 private:
  // Helper methods for better factored code in frame::sender
  frame sender_for_compiled_frame(RegisterMap* map) const;
  frame sender_for_entry_frame(RegisterMap* map) const;
  frame sender_for_interpreter_frame(RegisterMap* map) const;
  frame sender_for_native_frame(RegisterMap* map) const;

  // All frames:

  // A low-level interface for vframes:

 public:

  intptr_t* addr_at(int index) const             { return &fp()[index];    }
  intptr_t  at(int index) const                  { return *addr_at(index); }

  // accessors for locals
  oop obj_at(int offset) const                   { return *obj_at_addr(offset);  }
  void obj_at_put(int offset, oop value)         { *obj_at_addr(offset) = value; }

  jint int_at(int offset) const                  { return *int_at_addr(offset);  }
  void int_at_put(int offset, jint value)        { *int_at_addr(offset) = value; }

  oop*      obj_at_addr(int offset) const        { return (oop*)     addr_at(offset); }

  oop*      adjusted_obj_at_addr(Method* method, int index) { return obj_at_addr(adjust_offset(method, index)); }

 private:
  jint*    int_at_addr(int offset) const         { return (jint*)    addr_at(offset); }

 public:
  // Link (i.e., the pointer to the previous frame)
  intptr_t* link() const;
  void set_link(intptr_t* addr);

  // Return address
  address  sender_pc() const;

  // Support for deoptimization
  void deoptimize(JavaThread* thread);

  // The frame's original SP, before any extension by an interpreted callee;
  // used for packing debug info into vframeArray objects and vframeArray lookup.
  intptr_t* unextended_sp() const;

  // returns the stack pointer of the calling frame
  intptr_t* sender_sp() const;

  // Returns the real 'frame pointer' for the current frame.
  // This is the value expected by the platform ABI when it defines a
  // frame pointer register. It may differ from the effective value of
  // the FP register when that register is used in the JVM for other
  // purposes (like compiled frames on some platforms).
  // On other platforms, it is defined so that the stack area used by
  // this frame goes from real_fp() to sp().
  intptr_t* real_fp() const;

  // Deoptimization info, if needed (platform dependent).
  // Stored in the initial_info field of the unroll info, to be used by
  // the platform dependent deoptimization blobs.
  intptr_t *initial_deoptimization_info();

  // Interpreter frames:

 private:
  intptr_t** interpreter_frame_locals_addr() const;
  intptr_t*  interpreter_frame_bcx_addr() const;
  intptr_t*  interpreter_frame_mdx_addr() const;

 public:
  // Locals

  // The _at version returns a pointer because the address is used for GC.
  intptr_t* interpreter_frame_local_at(int index) const;

  void interpreter_frame_set_locals(intptr_t* locs);

  // byte code index/pointer (use these functions for unchecked frame access only!)
  intptr_t interpreter_frame_bcx() const                  { return *interpreter_frame_bcx_addr(); }
  void interpreter_frame_set_bcx(intptr_t bcx);

  // byte code index
  jint interpreter_frame_bci() const;
  void interpreter_frame_set_bci(jint bci);

  // byte code pointer
  address interpreter_frame_bcp() const;
  void    interpreter_frame_set_bcp(address bcp);

  // Unchecked access to the method data index/pointer.
  // Only use this if you know what you are doing.
  intptr_t interpreter_frame_mdx() const                  { return *interpreter_frame_mdx_addr(); }
  void interpreter_frame_set_mdx(intptr_t mdx);

  // method data pointer
  address interpreter_frame_mdp() const;
  void    interpreter_frame_set_mdp(address dp);

  // Find receiver out of caller's (compiled) argument list
  oop retrieve_receiver(RegisterMap *reg_map);

  // Return the monitor owner and BasicLock for compiled synchronized
  // native methods so that biased locking can revoke the receiver's
  // bias if necessary.  This is also used by JVMTI's GetLocalInstance method
  // (via VM_GetReceiver) to retrieve the receiver from a native wrapper frame.
  BasicLock* get_native_monitor();
  oop        get_native_receiver();

  // Find receiver for an invoke when arguments are just pushed on stack (i.e., callee stack-frame is
  // not setup)
  oop interpreter_callee_receiver(Symbol* signature)     { return *interpreter_callee_receiver_addr(signature); }

  oop* interpreter_callee_receiver_addr(Symbol* signature);

  // expression stack (may go up or down, direction == 1 or -1)
 public:
  intptr_t* interpreter_frame_expression_stack() const;
  static  jint  interpreter_frame_expression_stack_direction();

  // The _at version returns a pointer because the address is used for GC.
  intptr_t* interpreter_frame_expression_stack_at(jint offset) const;

  // top of expression stack
  intptr_t* interpreter_frame_tos_at(jint offset) const;
  intptr_t* interpreter_frame_tos_address() const;

  jint  interpreter_frame_expression_stack_size() const;

  intptr_t* interpreter_frame_sender_sp() const;

#ifndef CC_INTERP
  // template based interpreter deoptimization support
  void  set_interpreter_frame_sender_sp(intptr_t* sender_sp);
  void interpreter_frame_set_monitor_end(BasicObjectLock* value);
#endif // CC_INTERP

  // Address of the temp oop in the frame. Needed as GC root.
  oop* interpreter_frame_temp_oop_addr() const;

  // BasicObjectLocks:
  //
  // interpreter_frame_monitor_begin is higher in memory than interpreter_frame_monitor_end
  // Interpreter_frame_monitor_begin points to one element beyond the oldest one,
  // interpreter_frame_monitor_end   points to the youngest one, or if there are none,
  //                                 it points to one beyond where the first element will be.
  // interpreter_frame_monitor_size  reports the allocation size of a monitor in the interpreter stack.
  //                                 this value is >= BasicObjectLock::size(), and may be rounded up

  BasicObjectLock* interpreter_frame_monitor_begin() const;
  BasicObjectLock* interpreter_frame_monitor_end()   const;
  BasicObjectLock* next_monitor_in_interpreter_frame(BasicObjectLock* current) const;
  BasicObjectLock* previous_monitor_in_interpreter_frame(BasicObjectLock* current) const;
  static int interpreter_frame_monitor_size();

  void interpreter_frame_verify_monitor(BasicObjectLock* value) const;

  // Tells whether the current interpreter_frame frame pointer
  // corresponds to the old compiled/deoptimized fp
  // The receiver used to be a top level frame
  bool interpreter_frame_equals_unpacked_fp(intptr_t* fp);

  // Return/result value from this interpreter frame
  // If the method return type is T_OBJECT or T_ARRAY populates oop_result
  // For other (non-T_VOID) the appropriate field in the jvalue is populated
  // with the result value.
  // Should only be called when at method exit when the method is not
  // exiting due to an exception.
  BasicType interpreter_frame_result(oop* oop_result, jvalue* value_result);

 public:
  // Method & constant pool cache
  Method* interpreter_frame_method() const;
  void interpreter_frame_set_method(Method* method);
  Method** interpreter_frame_method_addr() const;
  ConstantPoolCache** interpreter_frame_cache_addr() const;

 public:
  // Entry frames
  JavaCallWrapper* entry_frame_call_wrapper() const { return *entry_frame_call_wrapper_addr(); }
  JavaCallWrapper* entry_frame_call_wrapper_if_safe(JavaThread* thread) const;
  JavaCallWrapper** entry_frame_call_wrapper_addr() const;
  intptr_t* entry_frame_argument_at(int offset) const;

  // tells whether there is another chunk of Delta stack above
  bool entry_frame_is_first() const;

  // Compiled frames:

 public:
  // Given the index of a local, and the number of argument words
  // in this stack frame, tell which word of the stack frame to find
  // the local in.  Arguments are stored above the ofp/rpc pair,
  // while other locals are stored below it.
  // Since monitors (BasicLock blocks) are also assigned indexes,
  // but may have different storage requirements, their presence
  // can also affect the calculation of offsets.
  static int local_offset_for_compiler(int local_index, int nof_args, int max_nof_locals, int max_nof_monitors);

  // Given the index of a monitor, etc., tell which word of the
  // stack frame contains the start of the BasicLock block.
  // Note that the local index by convention is the __higher__
  // of the two indexes allocated to the block.
  static int monitor_offset_for_compiler(int local_index, int nof_args, int max_nof_locals, int max_nof_monitors);

  // Tell the smallest value that local_offset_for_compiler will attain.
  // This is used to help determine how much stack frame to allocate.
  static int min_local_offset_for_compiler(int nof_args, int max_nof_locals, int max_nof_monitors);

  // Tells if this register must be spilled during a call.
  // On Intel, all registers are smashed by calls.
  static bool volatile_across_calls(Register reg);
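The class comment above distinguishes physical frames from source-level activations (vframes). From the Java side only the source-level view is observable. The sketch below is an assumption-laden illustration and does not use any HotSpot internals: it relies only on the standard Throwable.getStackTrace() API, and the class and method names are hypothetical. The expectation is that each Java method is reported as its own stack trace element even if the JIT compiler has inlined several of them into one physical frame, although the Javadoc does allow a JVM to omit frames in some circumstances.

```java
// Hypothetical demo class; the names are illustrative only.
final class StackFramesDemo {
    // Returns the names of the three innermost source-level frames.
    static String[] innermostFrames() {
        return level1();
    }

    private static String[] level1() {
        return level2();
    }

    private static String[] level2() {
        // Element 0 describes the method in which the Throwable was created.
        StackTraceElement[] trace = new Throwable().getStackTrace();
        return new String[] {
            trace[0].getMethodName(),
            trace[1].getMethodName(),
            trace[2].getMethodName()
        };
    }

    public static void main(String[] args) {
        for (String name : innermostFrames()) {
            System.out.println(name);
        }
    }
}
```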

hotspot source code - java method calls

Posted on 2018-08-05 | In java

vm/runtime/javaCalls.hpp, line 179, begins the definition of the JavaCalls::call() family of methods. The source is as follows:

// All calls to Java have to go via JavaCalls. Sets up the stack frame
// and makes sure that the last_Java_frame pointers are chained correctly.
//

class JavaCalls: AllStatic {
  static void call_helper(JavaValue* result, methodHandle* method, JavaCallArguments* args, TRAPS);
 public:
  // Optimized Constuctor call
  static void call_default_constructor(JavaThread* thread, methodHandle method, Handle receiver, TRAPS);

  // call_special
  // ------------
  // The receiver must be first oop in argument list
  static void call_special(JavaValue* result, KlassHandle klass, Symbol* name, Symbol* signature, JavaCallArguments* args, TRAPS);

  static void call_special(JavaValue* result, Handle receiver, KlassHandle klass, Symbol* name, Symbol* signature, TRAPS); // No args
  static void call_special(JavaValue* result, Handle receiver, KlassHandle klass, Symbol* name, Symbol* signature, Handle arg1, TRAPS);
  static void call_special(JavaValue* result, Handle receiver, KlassHandle klass, Symbol* name, Symbol* signature, Handle arg1, Handle arg2, TRAPS);

  // virtual call
  // ------------

  // The receiver must be first oop in argument list
  static void call_virtual(JavaValue* result, KlassHandle spec_klass, Symbol* name, Symbol* signature, JavaCallArguments* args, TRAPS);

  static void call_virtual(JavaValue* result, Handle receiver, KlassHandle spec_klass, Symbol* name, Symbol* signature, TRAPS); // No args
  static void call_virtual(JavaValue* result, Handle receiver, KlassHandle spec_klass, Symbol* name, Symbol* signature, Handle arg1, TRAPS);
  static void call_virtual(JavaValue* result, Handle receiver, KlassHandle spec_klass, Symbol* name, Symbol* signature, Handle arg1, Handle arg2, TRAPS);

  // Static call
  // -----------
  static void call_static(JavaValue* result, KlassHandle klass, Symbol* name, Symbol* signature, JavaCallArguments* args, TRAPS);

  static void call_static(JavaValue* result, KlassHandle klass, Symbol* name, Symbol* signature, TRAPS);
  static void call_static(JavaValue* result, KlassHandle klass, Symbol* name, Symbol* signature, Handle arg1, TRAPS);
  static void call_static(JavaValue* result, KlassHandle klass, Symbol* name, Symbol* signature, Handle arg1, Handle arg2, TRAPS);

  // Low-level interface
  static void call(JavaValue* result, methodHandle method, JavaCallArguments* args, TRAPS);
};

hotspot source code - object monitor

Posted on 2018-08-04 | In java

vm/runtime/objectMonitor.hpp, line 61, defines ObjectMonitor. The source is as follows:

// WARNING:
//   This is a very sensitive and fragile class. DO NOT make any
// change unless you are fully aware of the underlying semantics.

//   This class can not inherit from any other class, because I have
// to let the displaced header be the very first word. Otherwise I
// have to let markOop include this file, which would export the
// monitor data structure to everywhere.
//
// The ObjectMonitor class is used to implement JavaMonitors which have
// transformed from the lightweight structure of the thread stack to a
// heavy weight lock due to contention

// It is also used as RawMonitor by the JVMTI

class ObjectMonitor {
 public:
  enum {
    OM_OK,                    // no error
    OM_SYSTEM_ERROR,          // operating system error
    OM_ILLEGAL_MONITOR_STATE, // IllegalMonitorStateException
    OM_INTERRUPTED,           // Thread.interrupt()
    OM_TIMED_OUT              // Object.wait() timed out
  };

 public:
  // TODO-FIXME: the "offset" routines should return a type of off_t instead of int ...
  // ByteSize would also be an appropriate type.
  static int header_offset_in_bytes()      { return offset_of(ObjectMonitor, _header);     }
  static int object_offset_in_bytes()      { return offset_of(ObjectMonitor, _object);     }
  static int owner_offset_in_bytes()       { return offset_of(ObjectMonitor, _owner);      }
  static int count_offset_in_bytes()       { return offset_of(ObjectMonitor, _count);      }
  static int recursions_offset_in_bytes()  { return offset_of(ObjectMonitor, _recursions); }
  static int cxq_offset_in_bytes()         { return offset_of(ObjectMonitor, _cxq) ;       }
  static int succ_offset_in_bytes()        { return offset_of(ObjectMonitor, _succ) ;      }
  static int EntryList_offset_in_bytes()   { return offset_of(ObjectMonitor, _EntryList);  }
  static int FreeNext_offset_in_bytes()    { return offset_of(ObjectMonitor, FreeNext);    }
  static int WaitSet_offset_in_bytes()     { return offset_of(ObjectMonitor, _WaitSet) ;   }
  static int Responsible_offset_in_bytes() { return offset_of(ObjectMonitor, _Responsible);}
  static int Spinner_offset_in_bytes()     { return offset_of(ObjectMonitor, _Spinner);    }

 public:
  // Eventaully we'll make provisions for multiple callbacks, but
  // now one will suffice.
  static int (*SpinCallbackFunction)(intptr_t, int) ;
  static intptr_t SpinCallbackArgument ;

 public:
  markOop   header() const;
  void      set_header(markOop hdr);
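The comments on the enum above mention IllegalMonitorStateException and Thread.interrupt(). The following sketch shows the Java-level behaviour those comments refer to, using only the documented java.lang.Object API. How these cases are routed to the OM_* values inside the VM is not traced here, and the class and method names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
final class MonitorStates {
    // Calling notify() without owning the object's monitor is rejected
    // with IllegalMonitorStateException.
    static boolean notifyWithoutOwnership() {
        Object lock = new Object();
        try {
            lock.notify();
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }

    // A thread that is already interrupted when it calls wait() gets an
    // InterruptedException, and its interrupt status is cleared.
    static boolean waitInterrupted() {
        Object lock = new Object();
        Thread.currentThread().interrupt();
        synchronized (lock) {
            try {
                lock.wait(1_000);
                return false;
            } catch (InterruptedException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(notifyWithoutOwnership());
        System.out.println(waitInterrupted());
    }
}
```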

yuweijun
