Effective C# Chapter 2: .NET Resource Management (Translation)

news Source: original 2024/5/3 9:39:10

Chapter 2. .NET Resource Management

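One simple fact: .NET applications run in a managed environment, and that environment has a big impact on the kinds of designs that make for effective C#. Getting the most out of this environment requires shifting your thinking from native environments to the .NET CLR, which means understanding the .NET garbage collector. Before you can follow the recommendations in this chapter, you need a general picture of the .NET memory management environment, so let's start with that overview.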

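The garbage collector (GC) manages managed memory for you. Unlike in a native environment, you are not responsible for memory leaks, dangling pointers, uninitialized pointers, or a host of other memory-management problems. But the garbage collector is not magic: you still have to clean up after yourself. You are responsible for unmanaged resources such as file handles, database connections, GDI+ objects, COM objects, and other system objects.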

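Here is the good news: because the GC manages memory, certain design idioms become much easier to implement. Circular references, whether simple relationships or complex webs of objects, are much easier to handle. The GC's mark-and-compact algorithm detects these relationships efficiently and removes unreachable webs of objects in their entirety. The GC decides whether an object is reachable by walking the object tree from the application's root object, rather than forcing every object to track the references to it, as COM does. DataSet is a good example of how this algorithm simplifies decisions about object ownership. A DataSet is a collection of DataTables, each DataTable is a collection of DataRows, and each DataRow is a collection of DataItems. Each DataTable also holds a collection of DataColumns, which define the type of each column of data. There are references from each DataItem to its column. Each DataItem also holds a reference to its container, the DataRow. Each DataRow holds a reference back to its DataTable, and every object holds a reference back to the containing DataSet.
(Translator's note: the author's point is that the GC handles even reference relationships this complicated with ease, which shows how powerful it is.)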

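If that is not complicated enough, you can create DataViews, which provide access to filtered sequences of a data table. These are all managed by a DataViewManager. References run all through the web of objects that makes up a DataSet. Releasing the memory is the GC's responsibility. Because the .NET Framework designers did not have to free these objects themselves, the complicated web of references posed no problem. Nobody had to decide on the proper order for freeing the objects in the web; that is the GC's job. The GC's design makes it simple to identify such a web of objects as garbage. Once the application releases its reference to the DataSet, none of the subordinate objects can be reached (translator's note: that is, the objects inside the DataSet can no longer be referenced). It no longer matters that objects in the web still hold circular references to the DataSet and the DataTables: because these objects cannot be reached from the application, they are all garbage.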

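The garbage collector runs on its own thread and removes unused memory from your program. Each time it runs, it also compacts the managed heap. Compacting moves the live objects in the managed heap together so that the free space forms one contiguous block. Figure 2.1 shows two snapshots of the heap, before and after a garbage collection. After each collection, all the free memory is contiguous.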

Figure 2.1 The garbage collector does not just remove unused memory; it also moves the other objects to compact the memory in use and maximize free space.

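As you have just learned, memory management is entirely the garbage collector's responsibility. All other system resources are your responsibility. You can make sure system resources get freed by defining a finalizer in your type. The system calls the finalizer before the garbage collector removes the object from memory. You can, and must, use it to release any unmanaged resources the object owns. An object's finalizer is called at some point after the object becomes garbage but before its memory is reclaimed. This nondeterministic finalization means you cannot control the relationship between when you stop using an object and when its finalizer runs (translator's note: finalizing an object and an object becoming unreachable are two entirely different concepts. For more on the GC, I recommend the garbage collector discussion in Jeffrey's ".Net框架程序设计(修订版)"). This is a big change from C++, and it has important consequences for your designs. Experienced C++ programmers write classes that acquire a critical resource in the constructor and release it in the destructor: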
// Good C++, bad C#:
class CriticalSection
{
public:
    // Constructor acquires the system resource.
    CriticalSection( )
    {
        EnterCriticalSection( );
    }

    // Destructor releases system resource.
    ~CriticalSection( )
    {
        ExitCriticalSection( );
    }
};

// usage:
void Func( )
{
    // The lifetime of s controls access to
    // the system resource.
    CriticalSection s;
    // Do work.

    //...

    // compiler generates call to destructor.
    // code exits critical section.
}

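This is a very common C++ idiom, and it guarantees that the resource is released even when an exception occurs. It does not work in C#, however, at least not in the same way. Deterministic finalization is not part of the .NET environment or the C# language. Forcing the C++ idiom of deterministic finalization into C# will not work well. In C#, the finalizer does eventually run, but it does not run promptly. In the example above, the code eventually exits the critical section, but in C# it does not exit when the function returns. That happens at some unknown later time. You do not know when, and you cannot know when.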

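Relying on finalizers also costs performance. Objects that need finalization are a drag on the garbage collector. When the GC finds an object that is garbage but needs finalization, it cannot remove the object from memory yet. The finalizer has to be called first, and finalizers do not run on the thread that collects garbage. Instead, the GC places the object on a finalization queue and lets another thread execute all the finalizers. The GC carries on with its own work, removing other garbage from memory. Only on the next GC cycle are the finalized objects removed from memory. Figure 2.2 shows three GC operations and the difference in memory usage. Notice that objects that need finalization stay in memory until the next collection.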

Figure 2.2 This sequence shows the effect of finalizers on the garbage collector. Objects stay in memory longer, and an extra thread has to be started to run the finalizers.

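This might lead you to believe that an object that needs finalization lives in memory for at least one extra GC cycle. But I have simplified things. The reality is more complicated because of another GC design decision (translator's note: there is only one GC; the author is leading into generations). The .NET collector uses generations to optimize its work. Generations help the GC quickly identify the objects most likely to be garbage. Any object created since the last collection is a generation 0 object. Any object that has survived one collection is a generation 1 object. Any object that has survived two or more collections is a generation 2 object (translator's note: the GC currently supports only three generations, 0 through 2, so generation 2 is the highest. If the GC supports more generations in the future, higher-generation objects will appear. .NET 1.1 and 2.0 both support only three generations, a number Microsoft has found to be reasonable).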

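The purpose of generations is to separate short-lived local variables from objects that live for the life of the application. Generation 0 objects are mostly local variables. Member variables and global variables quickly enter generation 1 and eventually generation 2.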

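The GC optimizes its work by limiting how often it examines generation 1 and generation 2 objects. Every GC cycle examines generation 0 objects. Roughly one collection in 10 examines generation 0 and generation 1 objects, and roughly one in 100 examines all objects. Now think about the cost of finalization again: an object that needs finalization might stay in memory for nine more GC cycles than an object that does not. If it still has not been finalized, it moves to generation 2. In generation 2, an object lives for an extra 100 GC cycles until the next generation 2 collection (translator's note: I did not understand this sentence).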

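To close, remember the biggest benefit of a managed environment in which the garbage collector handles memory management: memory leaks and a host of other pointer-related problems are no longer your problem. Nonmemory resources force you to write finalizers to make sure those resources are cleaned up. Finalizers affect your application's performance, but you must write them to prevent resource leaks (translator's note: be clear about what nonmemory resources are; typically they are file handles, network resources, or other resources that cannot be held in memory). Implementing the IDisposable interface avoids the performance cost that finalizers impose on the garbage collector. The specific items that follow will help you use this environment more effectively.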

Chapter 2. .NET Resource Management
The simple fact that .NET programs run in a managed environment has a big impact on the kinds of designs that create effective C#. Taking utmost advantage of that environment requires changing your thinking from native environments to the .NET CLR. It means understanding the .NET Garbage Collector. An overview of the .NET memory management environment is necessary to understand the specific recommendations in this chapter, so let's get on with the overview.

The Garbage Collector (GC) controls managed memory for you. Unlike native environments, you are not responsible for memory leaks, dangling pointers, uninitialized pointers, or a host of other memory-management issues. But the Garbage Collector is not magic: You need to clean up after yourself, too. You are responsible for unmanaged resources such as file handles, database connections, GDI+ objects, COM objects, and other system objects.

Here's the good news: Because the GC controls memory, certain design idioms are much easier to implement. Circular references, both simple relationships and complex webs of objects, are much easier. The GC's Mark and Compact algorithm efficiently detects these relationships and removes unreachable webs of objects in their entirety. The GC determines whether an object is reachable by walking the object tree from the application's root object instead of forcing each object to keep track of references to it, as in COM. The DataSet class provides an example of how this algorithm simplifies object ownership decisions. A DataSet is a collection of DataTables. Each DataTable is a collection of DataRows. Each DataRow is a collection of DataItems. Each DataTable also contains a collection of DataColumns. DataColumns define the types associated with each column of data. There are other references from each DataItem to its appropriate column. Every DataItem also contains a reference to its container, the DataRow. DataRows contain references back to the DataTable, and everything contains a reference back to the containing DataSet.

If that's not complicated enough, you can create DataViews that provide access to filtered sequences of a data table. Those are all managed by a DataViewManager. There are references all through the web of objects that make up a DataSet. Releasing memory is the GC's responsibility. Because the .NET Framework designers did not need to free these objects, the complicated web of object references did not pose a problem. No decision needed to be made regarding the proper sequence of freeing this web of objects; it's the GC's job. The GC's design simplifies the problem of identifying this kind of web of objects as garbage. After the application releases its reference to the dataset, none of the subordinate objects can be reached. It does not matter that there are still circular references to the DataSet, DataTables, and other objects in the web. Because these objects cannot be reached from the application, they are all garbage.
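The point about circular references can be tried out with a small C# sketch. The `Parent` and `Child` types and the `CycleDemo` class are hypothetical names made up for this illustration; they only mimic the kind of back-references a DataRow and DataTable hold. The sketch builds a two-object cycle, drops every strong reference to it, and asks a `WeakReference` whether the cycle survived a forced collection.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical types for illustration: a parent and child that point at each other.
class Parent
{
    public Child Child;
}

class Child
{
    public Parent Owner;   // back-reference that closes the cycle
}

static class CycleDemo
{
    // Build the cycle in a separate, non-inlined method so that no local
    // variable in the caller keeps the objects alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeCycle()
    {
        var parent = new Parent();
        parent.Child = new Child { Owner = parent };
        return new WeakReference(parent);   // weak: does not keep parent alive
    }

    public static void Main()
    {
        WeakReference weak = MakeCycle();

        GC.Collect();                    // forced here only for demonstration
        GC.WaitForPendingFinalizers();

        // The two objects still reference each other, but nothing reachable
        // from the application references them.
        Console.WriteLine("Cycle still alive: " + weak.IsAlive);
    }
}
```

In an optimized build this usually reports that the cycle is gone, but the exact timing depends on the runtime and build settings, so treat the result as an observation rather than a guarantee.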

The Garbage Collector runs in its own thread to remove unused memory from your program. It also compacts the managed heap each time it runs. Compacting the heap moves each live object in the managed heap so that the free space is located in one contiguous block of memory. Figure 2.1 shows two snapshots of the heap before and after a garbage collection. All free memory is placed in one contiguous block after each GC operation.

Figure 2.1. The Garbage Collector not only removes unused memory, but it moves other objects in memory to compact used memory and maximize free space.

As you've just learned, memory management is completely the responsibility of the Garbage Collector. All other system resources are your responsibility. You can guarantee that you free other system resources by defining a finalizer in your type. Finalizers are called by the system before an object that is garbage is removed from memory. You can, and must, use these methods to release any unmanaged resources that an object owns. The finalizer for an object is called at some time after it becomes garbage and before the system reclaims its memory. This nondeterministic finalization means that you cannot control the relationship between when you stop using an object and when its finalizer executes. That is a big change from C++, and it has important ramifications for your designs. Experienced C++ programmers wrote classes that allocated a critical resource in the constructor and released it in the destructor:

// Good C++, bad C#:
class CriticalSection
{
public:
    // Constructor acquires the system resource.
    CriticalSection( )
    {
        EnterCriticalSection( );
    }

    // Destructor releases system resource.
    ~CriticalSection( )
    {
        ExitCriticalSection( );
    }
};

// usage:
void Func( )
{
    // The lifetime of s controls access to
    // the system resource.
    CriticalSection s;
    // Do work.

    //...

    // compiler generates call to destructor.
    // code exits critical section.
}

This common C++ idiom ensures that resource deallocation is exception-proof. This doesn't work in C#, however, at least not in the same way. Deterministic finalization is not part of the .NET environment or the C# language. Trying to force the C++ idiom of deterministic finalization into the C# language won't work well. In C#, the finalizer eventually executes, but it doesn't execute in a timely fashion. In the previous example, the code eventually exits the critical section, but, in C#, it doesn't exit the critical section when the function exits. That happens at some unknown time later. You don't know when. You can't know when.
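For contrast, here is a minimal sketch of how C# can get scope-bound release without relying on a finalizer: implement IDisposable and wrap the object in a `using` statement. The `CriticalSectionScope` class is a hypothetical stand-in for the C++ wrapper above; it only records events in a list, so the ordering is visible without touching a real lock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the C++ CriticalSection wrapper.
sealed class CriticalSectionScope : IDisposable
{
    private readonly List<string> log;

    public CriticalSectionScope(List<string> log)
    {
        this.log = log;
        log.Add("enter");   // a real type would acquire the resource here
    }

    public void Dispose()
    {
        log.Add("exit");    // released at a point the caller controls
    }
}

static class UsingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        using (new CriticalSectionScope(log))
        {
            log.Add("work");
        }   // Dispose() runs here, even if the body throws
        log.Add("after");
        return log;
    }

    public static void Main()
    {
        // Prints: enter, work, exit, after
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The release happens when the `using` block ends, not whenever the collector gets around to the object, which is the behavior the C++ destructor provided.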

Relying on finalizers also introduces performance penalties. Objects that require finalization put a performance drag on the Garbage Collector. When the GC finds that an object is garbage but also requires finalization, it cannot remove that item from memory just yet. First, it calls the finalizer. Finalizers are not executed by the same thread that collects garbage. Instead, the GC places each object that is ready for finalization in a queue and spawns yet another thread to execute all the finalizers. It continues with its business, removing other garbage from memory. On the next GC cycle, those objects that have been finalized are removed from memory. Figure 2.2 shows three different GC operations and the difference in memory usage. Notice that the objects that require finalizers stay in memory for extra cycles.
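The claim that finalizers run on a different thread can be observed with a small sketch. `Tracked` and `FinalizerThreadDemo` are hypothetical names for this illustration. The finalizer records the managed thread ID it ran on, and the demo compares it with the ID of the calling thread after forcing a collection and waiting for the finalization queue to drain.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type whose finalizer records which thread executed it.
class Tracked
{
    public static int FinalizerThreadId = -1;   // -1 means "has not run yet"

    ~Tracked()
    {
        FinalizerThreadId = Environment.CurrentManagedThreadId;
    }
}

static class FinalizerThreadDemo
{
    // Allocate in a non-inlined method so the caller holds no reference.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Allocate()
    {
        new Tracked();
    }

    public static int Run()
    {
        Allocate();
        GC.Collect();                    // forced only for demonstration
        GC.WaitForPendingFinalizers();   // block until queued finalizers finish
        return Environment.CurrentManagedThreadId;
    }

    public static void Main()
    {
        int callerId = Run();
        Console.WriteLine("calling thread:   " + callerId);
        Console.WriteLine("finalizer thread: " + Tracked.FinalizerThreadId);
    }
}
```

If the finalizer has run, its thread ID differs from the calling thread's. Whether it has run by the time `Run` returns depends on the runtime and build settings, which is itself a reminder of how little control you have over finalization timing.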

Figure 2.2. This sequence shows the effect of finalizers on the Garbage Collector. Objects stay in memory longer, and an extra thread needs to be spawned to run the finalizers.

This might lead you to believe that an object that requires finalization lives in memory for one GC cycle more than necessary. But I simplified things. It's more complicated than that because of another GC design decision. The .NET Garbage Collector defines generations to optimize its work. Generations help the GC identify the likeliest garbage candidates more quickly. Any object created since the last garbage collection operation is a generation 0 object. Any object that has survived one GC operation is a generation 1 object. Any object that has survived two or more GC operations is a generation 2 object. The purpose of generations is to separate local variables and objects that stay around for the life of the application. Generation 0 objects are mostly local variables. Member variables and global variables quickly enter generation 1 and eventually enter generation 2.

The GC optimizes its work by limiting how often it examines first- and second-generation objects. Every GC cycle examines generation 0 objects. Roughly 1 GC out of 10 examines the generation 0 and 1 objects. Roughly 1 GC cycle out of 100 examines all objects. Think about finalization and its cost again: An object that requires finalization might stay in memory for nine GC cycles more than it would if it did not require finalization. If it still has not been finalized, it moves to generation 2. In generation 2, an object lives for an extra 100 GC cycles until the next generation 2 collection.
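Generation promotion can be watched directly with `GC.GetGeneration`. The sketch below is only an illustration (the `GenerationDemo` class is a made-up name): it keeps one object alive across two forced collections and records the generation reported each time. The exact numbers depend on the runtime, so the code prints what it observes instead of assuming a fixed sequence.

```csharp
using System;

static class GenerationDemo
{
    // Records the generation of one live object before and after
    // two forced collections.
    public static int[] Observe()
    {
        var survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);   // freshly allocated
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);   // survived one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);   // survived two collections

        GC.KeepAlive(survivor);   // keep the object reachable up to this point
        return seen;
    }

    public static void Main()
    {
        string observed = string.Join(" -> ", Observe());
        Console.WriteLine("generations observed: " + observed);
        Console.WriteLine("highest generation:   " + GC.MaxGeneration);
    }
}
```

On a typical desktop runtime the object starts in generation 0 and is promoted as it survives collections, never exceeding `GC.MaxGeneration`.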

To close, remember that a managed environment, where the Garbage Collector takes the responsibility for memory management, is a big plus: Memory leaks and a host of other pointer-related problems are no longer your problem. Nonmemory resources force you to create finalizers to ensure proper cleanup of those nonmemory resources. Finalizers can have a serious impact on the performance of your program, but you must write them to avoid resource leaks. Implementing and using the IDisposable interface avoids the performance drain on the Garbage Collector that finalizers introduce. The next section moves on to the specific items that will help you create programs that use this environment more effectively.
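As a preview of how IDisposable and a finalizer can work together, here is a minimal sketch. `ResourceHolder` is a hypothetical type, and a counter stands in for releasing a real unmanaged handle. `Dispose` does the cleanup deterministically and calls `GC.SuppressFinalize` so the object skips the finalization queue; the finalizer remains as a safety net for callers that forget to dispose.

```csharp
using System;

// Hypothetical wrapper around some unmanaged resource.
class ResourceHolder : IDisposable
{
    public static int ReleaseCount;   // how many times cleanup actually ran
    private bool disposed;

    public void Dispose()
    {
        Cleanup();
        GC.SuppressFinalize(this);   // cleanup is done: no finalization needed
    }

    ~ResourceHolder()
    {
        Cleanup();   // safety net if the caller never called Dispose()
    }

    private void Cleanup()
    {
        if (disposed) return;   // make repeated calls harmless
        disposed = true;
        ReleaseCount++;         // a real type would release its handle here
    }
}

static class DisposeDemo
{
    public static void Main()
    {
        using (new ResourceHolder())
        {
            // work with the resource
        }
        // Prints: releases: 1
        Console.WriteLine("releases: " + ResourceHolder.ReleaseCount);
    }
}
```

This is a simplified shape, not a complete treatment: types with derived classes or managed members that are themselves disposable need more care.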