I just had an interview, and I was asked to create a memory leak with Java. Needless to say, I felt pretty dumb, having no clue how to even start creating one.
What would an example be?
|||||||||||||||||||||
|
Here's a good way to create a true memory leak (objects inaccessible by running code but still stored in memory) in pure Java:
This works because the ThreadLocal keeps a reference to the object, which keeps a reference to its Class, which in turn keeps a reference to its ClassLoader. The ClassLoader, in turn, keeps a reference to all the Classes it has loaded. It gets worse because in many JVM implementations Classes and ClassLoaders are allocated straight into permgen and are never GC'd at all.

A variation on this pattern is why application containers (like Tomcat) can leak memory like a sieve if you frequently redeploy applications that happen to use ThreadLocals in any way: the application container uses Threads as described, and each time you redeploy the application a new ClassLoader is used.

Update: Since lots of people keep asking for it, here's some example code that shows this behavior in action. |
|||||||||||||||||||||
|
Static field holding object reference (especially a final field)
Calling
(Unclosed) open streams (file, network, etc.)
Unclosed connections
Areas that are unreachable from the JVM's garbage collector, such as memory allocated through native methods
In web applications, some objects are stored in application scope until the application is explicitly stopped or removed.
Incorrect or inappropriate JVM options, such as the See IBM jdk settings. |
|||||||||||||||||||||
|
A simple thing to do is to use a HashSet with an incorrect (or non-existent) hashCode() or equals() implementation. If you want these bad keys/elements to hang around, you can hold the collection in a static field.
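A minimal sketch of the HashSet idea above, assuming the element class simply never overrides equals()/hashCode(). The names BadKey, BadKeyLeak and remember are made up for illustration. Every "duplicate" is treated as a new element, and the static field keeps the whole set reachable:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical element type: it has a logical identity (id) but does NOT
// override equals()/hashCode(), so HashSet falls back to object identity.
class BadKey {
    final String id;

    BadKey(String id) {
        this.id = id;
    }
}

class BadKeyLeak {
    // The static field keeps the set, and everything in it, reachable for as
    // long as the class itself is loaded.
    static final Set<BadKey> SEEN = new HashSet<>();

    // Intended to record each id once; in practice it records every call.
    static void remember(String id) {
        SEEN.add(new BadKey(id));
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            remember("same-id");
        }
        // The set holds one entry per call, not one entry per distinct id.
        System.out.println(SEEN.size());
    }
}
```

Overriding equals() and hashCode() on BadKey (based on id) would make the set stay at one entry for this loop.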
|
|||||||||||||||||||||
|
Below is a non-obvious case where Java leaks, besides the standard cases of forgotten listeners, static references, bogus/modifiable keys in hashmaps, or threads stuck without any chance to end their life-cycle.
I'll concentrate mostly on threads, to show the danger of unmanaged threads; I don't wish to even touch Swing.
(I can add some more time wasters I have encountered upon request.) Good luck and stay safe; leaks are evil! |
|||||||||
|
The answer depends entirely on what the interviewer thought they were asking. Is it possible in practice to make Java leak? Of course it is, and there are plenty of examples in the other answers. But there are multiple meta-questions that may have been asked.
I'm reading your meta-question as "What's an answer I could have used in this interview situation?" Hence, I'm going to focus on interview skills instead of Java. I believe you're more likely to repeat the situation of not knowing the answer to a question in an interview than you are to be in a place of needing to know how to make Java leak. So, hopefully, this will help.

One of the most important skills you can develop for interviewing is learning to actively listen to the questions and working with the interviewer to extract their intent. Not only does this let you answer their question the way they want, but it also shows that you have some vital communication skills. And when it comes down to a choice between many equally talented developers, I'll hire the one who listens, thinks, and understands before they respond every time. |
|||||||||
|
Most examples here are "too complex". They are edge cases. With these examples, the programmer made a mistake (like not redefining equals/hashCode), or has been bitten by a corner case of the JVM/Java (loading of a class with statics...). I think that's not the type of example an interviewer wants, or even the most common case.

But there are much simpler cases for memory leaks. The garbage collector only frees what is no longer referenced. We as Java developers don't care about memory. We allocate it when needed and let it be freed automatically. Fine.

But any long-lived application tends to have shared state. It can be anything: statics, singletons... Non-trivial applications often build complex object graphs. Just forgetting to set a reference to null, or more often forgetting to remove one object from a collection, is enough to make a memory leak.

Of course all sorts of listeners (like UI listeners), caches, or any long-lived shared state tend to produce memory leaks if not properly handled. What should be understood is that this is not a Java corner case, or a problem with the garbage collector. It is a design problem. We design things so that we add a listener to a long-lived object, but we don't remove the listener when it is no longer needed. We cache objects, but we have no strategy to remove them from the cache.

We may have a complex graph that stores the previous state needed by a computation. But the previous state is itself linked to the state before it, and so on.

Just as we have to close SQL connections or files, we need to set the proper references to null and remove elements from collections. We should have proper caching strategies (maximum memory size, number of elements, or timers). All objects that allow a listener to be notified must provide both an addListener and a removeListener method. And when these notifiers are no longer used, they must clear their listener list.

A memory leak is indeed truly possible and is perfectly predictable.
No need for special language features or corner cases. Memory leaks are an indicator that something may be missing, or even of design problems. |
|||||||||||||||||||||
|
The following is a pretty pointless example, if you do not understand JDBC. Or at least how JDBC expects a developer to close
The problem with the above is that the In such an event where the Even if the JDBC driver were to implement

The above scenario of encountering exceptions during object finalization is related to another scenario that could possibly lead to a memory leak: object resurrection. Object resurrection is often done intentionally by creating a strong reference, from another object, to the object being finalized. When object resurrection is misused, it will lead to a memory leak in combination with other sources of memory leaks. There are plenty more examples that you can conjure up - like
|
|||||||||||||||||
|
Probably one of the simplest examples of a potential memory leak, and how to avoid it, is the implementation of ArrayList.remove(int):
If you were implementing it yourself, would you have thought to clear the array element that is no longer used ( |
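To make the point concrete, here is a rough sketch of a hand-rolled array-backed list. It is not the JDK's ArrayList source. SimpleList and rawSlot are hypothetical names, and rawSlot exists only so the effect can be observed. The interesting part is the single line that nulls out the vacated slot:

```java
import java.util.Arrays;

// Hypothetical minimal array-backed list, written for illustration only.
class SimpleList<E> {
    private Object[] elements = new Object[8];
    private int size;

    void add(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E remove(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        E old = (E) elements[index];
        int toMove = size - index - 1;
        if (toMove > 0) {
            // Shift the tail one slot to the left, over the removed element.
            System.arraycopy(elements, index + 1, elements, index, toMove);
        }
        // Without this line the backing array would keep a stale reference in
        // the slot just past the new end, so that object could not be collected.
        elements[--size] = null;
        return old;
    }

    int size() {
        return size;
    }

    // Exposed only so the example can be checked: the raw content of a slot.
    Object rawSlot(int i) {
        return elements[i];
    }
}
```

If you delete the `elements[--size] = null` line (and decrement size separately), the list still behaves correctly from the outside, which is exactly why this kind of leak is easy to miss.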
|||||||||||||||||||||
|
Any time you keep references around to objects that you no longer need you have a memory leak. See Handling memory leaks in Java programs for examples of how memory leaks manifest themselves in Java and what you can do about it. |
|||||||||||||||||||||
|
You can create a memory leak with the sun.misc.Unsafe class. In fact, this service class is used in various standard classes (for example, in java.nio classes). You can't create an instance of this class directly, but you can use reflection to do that. The code doesn't compile in the Eclipse IDE - compile it using command
|
|||||||||||||||||||||
|
I can copy my answer from here: Easiest way to cause memory leak in Java? "A memory leak, in computer science (or leakage, in this context), occurs when a computer program consumes memory but is unable to release it back to the operating system." (Wikipedia)

The easy answer is: You can't. Java does automatic memory management and will free resources that are not needed for you. You can't stop this from happening. It will ALWAYS be able to release the resources. In programs with manual memory management, this is different. You can get some memory in C using malloc(). To free the memory, you need the pointer that malloc returned and call free() on it. But if you don't have the pointer anymore (overwritten, or lifetime exceeded), then you are unfortunately incapable of freeing this memory and thus you have a memory leak.

All the other answers so far are, by my definition, not really memory leaks. They all aim at filling the memory with pointless stuff real fast. But at any time you could still dereference the objects you created, thus freeing the memory --> NO LEAK. (acconrad's answer comes pretty close though, I have to admit, since his solution is effectively to just "crash" the garbage collector by forcing it into an endless loop.)

The long answer is: You can get a memory leak by writing a library for Java using the JNI, which can have manual memory management and thus have memory leaks. If you call this library, your Java process will leak memory. Or, you can have bugs in the JVM, so that the JVM loses memory. There are probably bugs in the JVM, and there may even be some known ones since garbage collection is not that trivial, but then it's still a bug. By design this is not possible. You may be asking for some Java code that is affected by such a bug. Sorry, I don't know one, and it might well not be a bug anymore in the next Java version anyway. |
|||||||||||||||||
|
Here's a simple/sinister one via http://wiki.eclipse.org/Performance_Bloopers#String.substring.28.29.
Because the substring refers to the internal representation of the original, much longer string (this is how String.substring behaved before JDK 7u6, which changed it to copy the characters), the original stays in memory. Thus, as long as you have a StringLeaker in play, you have the whole original string in memory, too, even though you might think you're just holding on to a single-character string. The way to avoid storing an unwanted reference to the original string is to do something like this:
For added badness, you might also
Doing so will keep both the original long string and the derived substring in memory even after the StringLeaker instance has been discarded. |
|||||||||||||
|
Take any web application running in any servlet container (Tomcat, Jetty, Glassfish, whatever...). Redeploy the app 10 or 20 times in a row (it may be enough to simply touch the WAR in the server's autodeploy directory). Unless anybody has actually tested this, chances are high that you'll get an OutOfMemoryError after a couple of redeployments, because the application did not take care to clean up after itself. You may even find a bug in your server with this test.

The problem is that the lifetime of the container is longer than the lifetime of your application. You have to make sure that all references the container might have to objects or classes of your application can be garbage collected. If there is just one reference surviving the undeployment of your web app, the corresponding classloader, and as a consequence all classes of your web app, cannot be garbage collected. Threads started by your application, ThreadLocal variables, and logging appenders are some of the usual suspects to cause classloader leaks. |
||||
|
A common example of this in GUI code is when creating a widget/component and adding a listener to some static/application scoped object and then not removing the listener when the widget is destroyed. Not only do you get a memory leak, but also a performance hit as when whatever you are listening to fires events, all your old listeners are called too. |
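A toolkit-free sketch of that listener pattern, under the assumption of a made-up application-scoped EventBus and a Widget that registers itself on construction (neither name comes from a real GUI library). Widgets that are never disposed stay reachable through the static listener list and keep getting called:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical application-scoped event source; it lives as long as the app.
class EventBus {
    private static final List<Runnable> LISTENERS = new ArrayList<>();

    static void addListener(Runnable l) {
        LISTENERS.add(l);
    }

    static void removeListener(Runnable l) {
        LISTENERS.remove(l);
    }

    static int listenerCount() {
        return LISTENERS.size();
    }

    static void fire() {
        // Iterate over a copy so listeners may unregister while being notified.
        for (Runnable l : new ArrayList<>(LISTENERS)) {
            l.run();
        }
    }
}

// Hypothetical widget that listens to the bus for its whole lifetime.
class Widget {
    static int refreshCalls = 0;

    // The method reference captures "this", so the bus holds on to the Widget.
    private final Runnable listener = this::refresh;

    Widget() {
        EventBus.addListener(listener);
    }

    private void refresh() {
        refreshCalls++;
    }

    // The fix: unregister when the widget is destroyed.
    void dispose() {
        EventBus.removeListener(listener);
    }
}
```

Creating widgets in a loop without calling dispose() makes listenerCount() grow without bound, and every fire() also pays for all the dead widgets.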
|||||
|
Maybe by using external native code through JNI? With pure Java, it is almost impossible. But that is about a "standard" type of memory leak, when you cannot access the memory anymore, but it is still owned by the application. You can instead keep references to unused objects, or open streams without closing them afterwards. |
|||||||||||||||||
|
I have had a nice "memory leak" in relation to PermGen and XML parsing once. The XML parser we used (I can't remember which one it was) did a String.intern() on tag names, to make comparison faster. One of our customers had the great idea to store data values not in XML attributes or text, but as tagnames, so we had a document like:
In fact, they did not use numbers but longer textual IDs (around 20 characters), which were unique and came in at a rate of 10-15 million a day. That makes 200 MB of rubbish a day, which is never needed again, and never GCed (since it is in PermGen). We had permgen set to 512 MB, so it took around two days for the out-of-memory exception (OOME) to arrive... |
|||||||||||||
|
I recently encountered a memory leak situation caused, in a way, by log4j. Log4j has a mechanism called Nested Diagnostic Context (NDC), which is an instrument to distinguish interleaved log output from different sources. The granularity at which NDC works is threads, so it distinguishes log outputs from different threads separately. In order to store thread-specific tags, log4j's NDC class uses a Hashtable which is keyed by the Thread object itself (as opposed to, say, the thread id), and thus as long as the NDC tag stays in memory, all the objects that hang off of the thread object also stay in memory. In our web application we use NDC to tag log outputs with a request id to distinguish logs from a single request separately. The container that associates the NDC tag with a thread also removes it while returning the response from a request. The problem occurred when, during the course of processing a request, a child thread was spawned, something like the following code:
So an NDC context was associated with the inline thread that was spawned. The thread object that was the key for this NDC context is the inline thread, which has the hugeList object hanging off of it. Hence even after the thread finished doing what it was doing, the reference to the hugeList was kept alive by the NDC context Hashtable, thus causing a memory leak. |
|||||||||||||
|
I thought it was interesting that no one used the inner class examples. If you have an inner class, it inherently maintains a reference to the containing class. Of course it is not technically a memory leak, because Java WILL eventually clean it up, but this can cause classes to hang around longer than anticipated.
Now if you call Example1 and get an Example2 discarding Example1, you will inherently still have a link to an Example1 object.
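Roughly what that looks like, reusing the Example1/Example2 names from the text. The payload field, the Detached class and the method names are assumptions added for illustration. The inner instance can still reach the outer object's state after the caller has dropped its own reference to the outer object:

```java
class Example1 {
    // Stand-in for whatever makes the outer object expensive to keep alive.
    private final byte[] payload = new byte[1024 * 1024];

    // Non-static inner class: this instance uses the enclosing Example1
    // (Example1.this), so it keeps that Example1 reachable.
    class Example2 {
        int outerPayloadSize() {
            return payload.length; // reads Example1.this.payload
        }
    }

    // A static nested class has no enclosing instance and so no such link.
    static class Detached {
    }

    Example2 getExample2() {
        return new Example2();
    }
}
```

With `Example1.Example2 e2 = new Example1().getExample2();` the only variable in scope is `e2`, yet the 1 MB payload stays reachable for as long as `e2` does.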
I've also heard a rumor that if you have a variable that exists for longer than a specific amount of time, Java assumes that it will always exist and will actually never try to clean it up, even if it can no longer be reached in code. But that is completely unverified. |
|||||||||
|
What's a memory leak:
Typical example: A cache of objects is a good starting point to mess things up.
Your cache grows and grows. And pretty soon the entire database gets sucked into memory. A better design uses an LRUMap (Only keeps recently used objects in cache). Sure, you can make things a lot more complicated:
What often happens is that this Info object has references to other objects, which again have references to other objects. In a way, you could also consider this to be a kind of memory leak (caused by bad design). |
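For the "only keep recently used objects" idea, here is one possible sketch using the JDK's LinkedHashMap as a stand-in for the LRUMap mentioned above. LruCache and maxEntries are hypothetical names, and a real cache would probably also need thread safety:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry once the
// configured capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least recently
        // accessed to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true drops the eldest entry.
        return size() > maxEntries;
    }
}
```

Usage is the same as for a plain map (`cache.put(id, info)`, `cache.get(id)`), except that the map can no longer grow until the whole database sits in memory.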
||||
|
Create a static Map and keep adding hard references to it. Those will never be GC'd.
|
|||||||||||||||||||||
|
As a lot of people have suggested, Resource Leaks are fairly easy to cause - like the JDBC examples. Actual Memory leaks are a bit harder - especially if you aren't relying on broken bits of the JVM to do it for you... The ideas of creating objects that have a very large footprint and then not being able to access them aren't real memory leaks either. If nothing can access it then it will be garbage collected, and if something can access it then it's not a leak... One way that used to work though - and I don't know if it still does - is to have a three-deep circular chain. As in Object A has a reference to Object B, Object B has a reference to Object C and Object C has a reference to Object A. The GC was clever enough to know that a two deep chain - as in A <--> B - can safely be collected if A and B aren't accessible by anything else, but couldn't handle the three-way chain... |
|||||
|
I came across a more subtle kind of resource leak recently. We open resources via class loader's getResourceAsStream and it happened that the input stream handles were not closed. Uhm, you might say, what an idiot. Well, what makes this interesting is: this way, you can leak heap memory of the underlying process, rather than from JVM's heap. All you need is a jar file with a file inside which will be referenced from Java code. The bigger the jar file, the quicker memory gets allocated. You can easily create such a jar with the following class:
Just paste into a file named BigJarCreator.java, compile and run it from command line:
Et voilà: you find a jar archive in your current working directory with two files inside. Let's create a second class:
This class basically does nothing but create unreferenced InputStream objects. Those objects will be garbage collected immediately and thus do not contribute to heap size. It is important for our example to load an existing resource from a jar file, and size does matter here! If you're doubtful, try to compile and start the class above, but make sure to choose a decent heap size (2 MB):
You will not encounter an OOM error here; as no references are kept, the application will keep running no matter how large you choose ITERATIONS in the above example. The memory consumption of your process (visible in top (RES/RSS) or process explorer) grows until the application gets to the wait command. In the setup above, it will allocate around 150 MB in memory. If you want the application to play safe, close the input stream right where it's created:
and your process will not exceed 35 MB, independent of the iteration count. Quite simple and surprising. |
||||
|
Everyone always forgets the native code route. Here's a simple formula for a leak:
Remember, memory allocations in native code come from the native heap of the JVM process, not from the garbage-collected Java heap. |
|||||
|
You can create a moving memory leak by creating a new instance of a class in that class's finalize method. Bonus points if the finalizer creates multiple instances. Here's a simple program that leaks the entire heap in somewhere between a few seconds and a few minutes, depending on your heap size:
|
||||
|
I don't think anyone has said this yet: you can resurrect an object by overriding the finalize() method such that finalize() stores a reference to this somewhere. The finalizer will only be called once on the object, so after that the object will never be destroyed. |
|||||||||
|
There are many different situations in which memory will leak. One I encountered involved exposing a map that should not have been exposed and was used elsewhere.
|
||||
|
Threads are not collected until they terminate. They serve as roots of garbage collection. They are one of the few objects that won't be reclaimed simply by forgetting about them or clearing references to them.

Consider: the basic pattern to terminate a worker thread is to set some condition variable seen by the thread. The thread can check the variable periodically and use that as a signal to terminate. If the variable is not declared volatile, the thread may never see the change and so never terminate.

If you only have a handful of threads these bugs will probably be obvious because your program will stop working properly. If you have a thread pool that creates more threads as needed, then the obsolete/stuck threads might not be noticed, and will accumulate indefinitely, causing a memory leak.

Threads are likely to use other data in your application, so will also prevent anything they directly reference from ever being collected. As a toy example:
Call (*edited*) |
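For contrast, a small sketch of the stop-flag pattern described above done carefully. Worker, workingSet and startThenStop are hypothetical names, and the 5-second join timeout is an arbitrary choice. The flag is volatile, so the worker sees the stop request, terminates, and stops pinning the data it references:

```java
// Hypothetical worker that runs until asked to stop.
class Worker implements Runnable {
    // volatile: the write in stop() is guaranteed to become visible to run().
    private volatile boolean running = true;

    // Anything the thread references stays reachable while the thread lives.
    private final byte[] workingSet = new byte[1024 * 1024];

    @Override
    public void run() {
        while (running) {
            workingSet[0]++;   // stand-in for real work
            Thread.yield();
        }
    }

    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether the thread ended.
    static boolean startThenStop() {
        Worker worker = new Worker();
        Thread t = new Thread(worker, "worker");
        t.start();
        worker.stop();
        try {
            t.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

A worker whose stop() is never called keeps running, and its 1 MB working set can never be collected, no matter how many references to the Worker object you clear.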
|||||||||||||||||||||
|
An example I recently fixed is creating new GC and Image objects, but forgetting to call the dispose() method. GC javadoc snippet:
Image javadoc snippet:
|
||||
|
I think that a valid example could be using ThreadLocal variables in an environment where threads are pooled. For instance, using ThreadLocal variables in Servlets to communicate with other web components, having the threads being created by the container and maintaining the idle ones in a pool. ThreadLocal variables, if not correctly cleaned up, will live there until, possibly, the same web component overwrites their values. Of course, once identified, the problem can be solved easily. |
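A small sketch of that effect, assuming a one-thread pool as a stand-in for the container's worker threads. PooledThreadLocalDemo, REQUEST_DATA and leftoverSeenBySecondTask are made-up names. The value set by the first "request" is still there when the second one runs on the same pooled thread, unless it was removed:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadLocalDemo {
    // Hypothetical per-request context stored in a ThreadLocal.
    static final ThreadLocal<byte[]> REQUEST_DATA = new ThreadLocal<>();

    // Runs two "requests" on the same pooled thread and returns what the
    // second request finds in the ThreadLocal.
    static byte[] leftoverSeenBySecondTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                REQUEST_DATA.set(new byte[1024]); // stored on the worker thread
                if (cleanUp) {
                    REQUEST_DATA.remove();        // the fix: clear before returning
                }
            };
            pool.submit(firstRequest).get();

            Callable<byte[]> secondRequest = REQUEST_DATA::get;
            return pool.submit(secondRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In real request handling the remove() call would normally sit in a finally block, so the value is cleared even when the request fails.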
||||
|
The interviewer might have been looking for a circular reference solution:
This is a classic problem with reference counting garbage collectors. You would then politely explain that JVMs use a much more sophisticated algorithm that doesn't have this limitation. -Wes Tarle |
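A quick way to convince yourself of that, sketched with hypothetical Node and CycleDemo classes. Each call builds a two-object cycle and drops it. Under a pure reference-counting scheme these pairs could never be freed, whereas a tracing collector reclaims them once they are unreachable:

```java
// Hypothetical node type used only to build reference cycles.
class Node {
    Node other;
    final byte[] payload = new byte[16 * 1024];
}

class CycleDemo {
    // Builds A -> B -> A and hands back A.
    static Node linkedPair() {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        return a;
    }

    // Creates many cycles and immediately drops every reference to them.
    static int churn(int pairs) {
        int created = 0;
        for (int i = 0; i < pairs; i++) {
            linkedPair();   // result discarded: the pair is now unreachable
            created++;
        }
        return created;
    }
}
```

Running `CycleDemo.churn(10_000)` allocates roughly 320 MB in total, so starting the JVM with a much smaller heap (for example `-Xmx64m`) and watching it finish is a simple check that the cycles do get collected.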
|||||||||||||
|