1. Introduction
The book includes the following test of the reference returned by String.intern():
public static void main(String[] args) {
    String str1 = new StringBuilder("计算机").append("软件").toString();
    System.out.println(str1.intern() == str1);
    String str2 = new StringBuilder("ja").append("va").toString();
    System.out.println(str2.intern() == str2);
}
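The book explains it as follows:
Run on JDK 1.6, this code prints false twice; run on JDK 1.7, it prints true and then false. The difference comes from how intern() is implemented. In JDK 1.6, intern() copies the first string instance it encounters into the permanent generation and returns a reference to that copy. The instance created by StringBuilder lives on the Java heap, so the two references cannot be the same and the comparison yields false. In JDK 1.7, intern() no longer copies the instance; it only records, in the constant pool, a reference to the instance that appeared first, so the reference returned by intern() is the very instance that StringBuilder created. The str2 comparison yields false because the string "java" had already appeared before StringBuilder.toString() ran: the string constant pool already held a reference to it, so it does not satisfy the "first occurrence" rule. The string "计算机软件" does appear for the first time, so that comparison yields true.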
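The "first occurrence" rule can be checked directly on JDK 7 or later. The sketch below is our own illustration, not code from the book: the class name FirstOccurrenceDemo and the "intern-demo-" prefix are made up, and the timestamp and counter are only there to make it very unlikely that an equal string is already in the pool.

```java
// Illustrative sketch; assumes a HotSpot JVM, JDK 7 or later.
final class FirstOccurrenceDemo {

    private static int counter;

    /** Builds a string at run time that is very unlikely to be pooled already. */
    static String freshString() {
        return new StringBuilder("intern-demo-")
                .append(System.nanoTime()).append('-').append(counter++)
                .toString();
    }

    public static void main(String[] args) {
        String first = freshString();
        // Same characters, but a distinct object on the heap.
        String second = new String(first);

        // First occurrence: the pool records a reference to `first` itself.
        System.out.println(first.intern() == first);
        // Not the first occurrence: intern() returns the pooled instance instead...
        System.out.println(second.intern() == second);
        // ...and that pooled instance is `first`.
        System.out.println(second.intern() == first);
    }
}
```

On JDK 7 or later this should print true, false, true. On JDK 1.6 the first line would be expected to print false, because the pooled string is a copy in the permanent generation.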
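2. The Question
After reading this explanation, the biggest question for many readers is probably not how intern() differs between JDK 1.6 and 1.7 (we cover that in detail below), but why the string "java" had already appeared.
Since our own code never creates it, the JDK must have done so internally. To check this guess, let's look at what the JDK does during initialization.
Source of the System class: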
/**
 * The <code>System</code> class contains several useful class fields
 * and methods. It cannot be instantiated.
 *
 * <p>Among the facilities provided by the <code>System</code> class
 * are standard input, standard output, and error output streams;
 * access to externally defined properties and environment
 * variables; a means of loading files and libraries; and a utility
 * method for quickly copying a portion of an array.
 *
 * @author unascribed
 * @since JDK1.0
 */
public final class System {
    /* register the natives via the static initializer.
     *
     * VM will invoke the initializeSystemClass method to complete
     * the initialization for this class separated from clinit.
     * Note that to use properties set by the VM, see the constraints
     * described in the initializeSystemClass method.
     */
    private static native void registerNatives();
    ...
As the comment shows, the VM calls the initializeSystemClass method to complete the initialization of the System class.
/**
 * Initialize the system class. Called after thread initialization.
 */
private static void initializeSystemClass() {
    // VM might invoke JNU_NewStringPlatform() to set those encoding
    // sensitive properties (user.home, user.name, boot.class.path, etc.)
    // during "props" initialization, in which it may need access, via
    // System.getProperty(), to the related system encoding property that
    // have been initialized (put into "props") at early stage of the
    // initialization. So make sure the "props" is available at the
    // very beginning of the initialization and all system properties to
    // be put into it directly.
    props = new Properties();
    initProperties(props); // initialized by the VM
    // There are certain system configurations that may be controlled by
    // VM options such as the maximum amount of direct memory and
    // Integer cache size used to support the object identity semantics
    // of autoboxing. Typically, the library will obtain these values
    // from the properties set by the VM. If the properties are for
    // internal implementation use only, these properties should be
    // removed from the system properties.
    //
    // See java.lang.Integer.IntegerCache and the
    // sun.misc.VM.saveAndRemoveProperties method for example.
    //
    // Save a private copy of the system properties object that
    // can only be accessed by the internal implementation. Remove
    // certain system properties that are not intended for public access.
    sun.misc.VM.saveAndRemoveProperties(props);
    lineSeparator = props.getProperty("line.separator");
    sun.misc.Version.init();
    ...
Inside initializeSystemClass we find a call to the static init method of the Version class.
public class Version {
    private static final String launcher_name = "java";
    private static final String java_version = "1.8.0_171";
    private static final String java_runtime_name = "Java(TM) SE Runtime Environment";
    private static final String java_profile_name = "";
    private static final String java_runtime_version = "1.8.0_171-b11";
    ...
    public Version() {
    }
    public static void init() {
        System.setProperty("java.version", "1.8.0_171");
        System.setProperty("java.runtime.version", "1.8.0_171-b11");
        System.setProperty("java.runtime.name", "Java(TM) SE Runtime Environment");
    }
    ...
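In the Version class, launcher_name is a private static final String constant.
So sun.misc.Version is loaded and initialized while the JDK class library initializes. During that process its static constant fields are assigned the values given by their ConstantValue attributes, and at that point the "java" string literal referenced by the static constant field sun.misc.Version.launcher_name is interned into StringTable, which is the HotSpot VM's string constant pool.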
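The same mechanism can be reproduced with a class of our own. The sketch below is an illustration under assumptions, not JDK code: FakeVersion and the value "fake-launcher" are hypothetical stand-ins for sun.misc.Version and "java", and the behaviour described is what we would expect from HotSpot rather than something the language specification promises.

```java
// Illustrative sketch; assumes HotSpot. All names here are hypothetical.
final class ConstantValueDemo {

    /** Stand-in for sun.misc.Version: a constant that no other code reads. */
    static final class FakeVersion {
        private static final String launcher_name = "fake-launcher";

        /** Calling any static method forces the class to be loaded and initialized. */
        static void init() {
        }
    }

    /** Returns true if the run-time built string was the first occurrence in the pool. */
    static boolean builtStringIsFirstOccurrence() {
        FakeVersion.init();
        // Built at run time, so the literal "fake-launcher" never appears in this method.
        String built = new StringBuilder("fake-").append("launcher").toString();
        return built.intern() == built;
    }

    public static void main(String[] args) {
        System.out.println(builtStringIsFirstOccurrence());
    }
}
```

On HotSpot we would expect this to print false: by the time FakeVersion has been initialized, the constant's value is already in the string pool, so the string built by StringBuilder is not the first occurrence.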
Suppose we change
String str2 = new StringBuilder("ja").append("va").toString();
to
String str2 = new StringBuilder("1.8.0_").append("171").toString();
or
String str2 = new StringBuilder("Java(TM) SE Runtime ").append("Environment").toString();
On the JDK build shown above (1.8.0_171), the result is still false, which supports our guess.
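To try more candidates without editing the test each time, a small helper can do the probing. This is a minimal sketch: InternProbe and alreadyPooled are names we made up, and the results for the JDK-related strings depend on the vendor, version, and build, so no output is claimed for them here.

```java
// Illustrative helper; results for JDK-internal strings vary between builds.
final class InternProbe {

    /**
     * Joins the parts at run time and reports whether intern() returned a
     * different instance, i.e. whether an equal string was already pooled.
     * Side effect: a string that was not pooled before is pooled by this call.
     */
    static boolean alreadyPooled(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        String built = sb.toString();
        return built.intern() != built;
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println(alreadyPooled("ja", "va"));
        System.out.println(alreadyPooled("1.8.0_", "171"));
        System.out.println(alreadyPooled("Java(TM) SE Runtime ", "Environment"));
        // Control: a string that should not have been seen before.
        System.out.println(alreadyPooled("not-seen-", String.valueOf(System.nanoTime())));
    }
}
```

Because probing with intern() adds the string to the pool, each candidate gives a meaningful answer only the first time it is checked in a given JVM run.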
3. How It Works
Returning to the difference between intern() in JDK 1.6 and 1.7, let's look at the source of the intern() method (JDK 8):
/**
 * Returns a canonical representation for the string object.
 * <p>
 * A pool of strings, initially empty, is maintained privately by the
 * class {@code String}.
 * <p>
 * When the intern method is invoked, if the pool already contains a
 * string equal to this {@code String} object as determined by
 * the {@link #equals(Object)} method, then the string from the pool is
 * returned. Otherwise, this {@code String} object is added to the
 * pool and a reference to this {@code String} object is returned.
 * <p>
 * It follows that for any two strings {@code s} and {@code t},
 * {@code s.intern() == t.intern()} is {@code true}
 * if and only if {@code s.equals(t)} is {@code true}.
 * <p>
 * All literal strings and string-valued constant expressions are
 * interned. String literals are defined in section 3.10.5 of the
 * <cite>The Java™ Language Specification</cite>.
 *
 * @return a string that has the same contents as this string, but is
 * guaranteed to be from a pool of unique strings.
 */
public native String intern();
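As you can see, String.intern() is a native method.
In JDK 1.6 and earlier, the string constant pool is allocated in the permanent generation, a region entirely separate from the Java heap. If the pool already contains a string equal to this String object, intern() returns the String object from the pool; otherwise it copies the string into the pool in the permanent generation and returns a reference to that copy, not to the original object on the heap.
Starting with JDK 1.7, as part of the move away from the permanent generation, the string constant pool was relocated to the Java heap, and the intern() implementation changed with it. Because the pool and the objects created with new now both live on the Java heap, intern() no longer needs to copy anything: if the pool already contains an equal string, that string is returned; otherwise the pool simply stores a reference to the object on the heap, and that same reference is returned. This saves both time and memory.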
For background, see the article 《深入理解JVM——Java虚拟机内存模型》 (Understanding the JVM: The Java Virtual Machine Memory Model).
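Using the code above as the example, we can draw the memory layout to see what happens:
- Memory layout in JDK 1.6 and earlier
The local variables str1 and str2 live on the Java VM stack and refer to objects on the Java heap, whereas str1.intern() and str2.intern() return references to copies in the permanent generation, so both comparisons return false.
- Memory layout in JDK 1.7 and later
Again, str1 and str2 live on the Java VM stack and refer to objects on the heap. str1.intern() returns the same reference as str1, while str2.intern() returns the reference already recorded in the string constant pool, which points to a different "java" object. So one comparison returns true and the other returns false.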
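In Java, a String written as a double-quoted literal is placed in the string constant pool directly. A String created any other way can be added to the pool with the intern method that String provides: intern looks the string up in the pool, adds it if it is not there yet, and returns the pooled reference.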
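The difference between literals, constant expressions, and strings created at run time can be summarized in a few comparisons. The sketch below is our own illustration (the class name LiteralVsIntern and the value "hello" are arbitrary); it relies only on behaviour described in the intern() Javadoc quoted above and in section 3.10.5 of the Java Language Specification.

```java
import java.util.Arrays;

final class LiteralVsIntern {

    /** Returns the five comparison results in the order listed below. */
    static boolean[] compare() {
        String literal = "hello";
        String folded = "hel" + "lo";       // constant expression, folded at compile time
        String copy = new String("hello");  // a new object, distinct from the pooled literal
        String half = "hel";                // not declared final, so not a constant variable
        String joined = half + "lo";        // concatenated at run time into a new object

        return new boolean[] {
            literal == folded,          // both refer to the pooled literal
            literal == copy,            // heap object vs. pooled literal
            literal == joined,          // run-time result vs. pooled literal
            literal == copy.intern(),   // intern() returns the pooled literal
            literal == joined.intern()  // same here
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare()));
    }
}
```

This prints [true, false, false, true, true]: only literals and compile-time constant expressions are pooled automatically, and every other string has to go through intern() to get the pooled reference.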
References
http://baijiahao.baidu.com/s?id=1568390319555291&wfr=spider&for=pc
https://blog.csdn.net/qq_34490018/article/details/82110578