ArrayList vs LinkedList

I think I’ve said before that there is almost no actual utility whatsoever for a RAM-based linked list implementation. There is no software problem anyone will encounter that will not be solved better, faster, and more efficiently by ArrayList.

Cas :slight_smile:

The only use case I can think of is an extremely long queue (millions of elements) where realtime behaviour is required (ruling out array-based queues, as they have to grow/shift/compact sooner or later).

It’s very unlikely to have this requirement though.

Free lists of pool-allocated items come to mind (edit: but that’s in-place, so linked lists in general vs. LinkedList).

It does. Mark and sweep is one of the slowest forms of GC, and the more pointers you have to chase, the more time it takes. Java has to do M&S with the tenured generation, whereas it has fast copying collectors for the young generation. The more fragmented your memory gets, the less it’s able to take advantage of cache, and once your app starts paging to disk, the page thrashing can really kill your app (sometimes quite literally)

That’s one of the G1 GC’s strengths apparently - it performs the mark/sweep operation concurrently, and in small chunks, compacting collected stuff into contiguous RAM, which should then in theory make it a bit more cache-friendly next time.

Cas :slight_smile:

I have to agree that we will probably never need anything other than an ArrayList for games. I tried inserting integers into an ArrayList, always in the middle of the list, and I didn’t get any performance hit before 10,000 elements… After that the performance degrades really fast, but I don’t think many of us are using more than 10,000 elements.
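A minimal sketch of the test described above (the class name and element count are mine; this is an illustration, not the poster’s actual benchmark):

```java
import java.util.ArrayList;
import java.util.List;

public class MiddleInsert {
    // Insert every element at the middle of the list. Each insert is O(n)
    // because ArrayList has to shift roughly half the backing array.
    static List<Integer> fill(int count) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            list.add(list.size() / 2, i); // shifts ~size/2 elements via arraycopy
        }
        return list;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<Integer> list = fill(10_000);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(list.size() + " middle inserts took " + ms + " ms");
    }
}
```

Even though each insert is linear, System.arraycopy is fast enough that the quadratic cost only starts to bite at sizes well beyond what most games use.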

Note: With a mark-sweep-compact collector there is no fragmentation. I thought Java used that one, rather than a plain mark-sweep collector, for the tenured generation.

@princec : The concurrent collector is also supposed to do that, but when I tried it the performance was absolutely awful :frowning:

The concurrent collector performs most of its work concurrently (i.e., while the application is still running) to keep garbage collection pauses short. It is designed for applications with medium- to large-sized data sets for which response time is more important than overall throughput, since the techniques used to minimize pauses can reduce application performance. The concurrent collector is enabled with the option -XX:+UseConcMarkSweepGC.
</off topic>

Just for fun I did some raw microbenchmarks comparing the ways to manage a hypothetical particle or game entity system. Using the sweep-and-copy with ArrayList is approximately 4x faster than LinkedList on its own, and uses less memory. Using iterators slows things down by between 10% and 25%.

However… this was testing with 500,000 entities/particles, and I was getting times of under 1ms for ArrayList and around 4ms for LinkedList. A more realistic scenario for the kinds of games we make round these parts, with maybe 1/100th the number of entities, would mean that the amount of time spent culling entities or particles in a frame would probably be unmeasurable.

Furthermore with just 5,000 entities, we were talking only 2ms using brute force on an ArrayList. And this is for the pathologically unlikely scenario that you’re killing all 5,000 entities in a single frame.
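For reference, here is one way the "sweep-and-copy" cull could look, assuming it means copying the survivors into a second ArrayList each frame and then swapping the two lists (the class and method names are illustrative, not from the benchmark above):

```java
import java.util.ArrayList;

public class SweepCopy {
    static class Entity {
        boolean dead;
        Entity(boolean dead) { this.dead = dead; }
    }

    private ArrayList<Entity> live = new ArrayList<>();
    private ArrayList<Entity> spare = new ArrayList<>();

    void spawn(Entity e) { live.add(e); }
    int liveCount() { return live.size(); }

    // One cull pass: copy survivors into the spare list, then swap the lists.
    // No element shifting, no per-element removes, no iterator garbage.
    void cull() {
        spare.clear();
        for (int i = 0, n = live.size(); i < n; i++) {
            Entity e = live.get(i);
            if (!e.dead) spare.add(e); // survivors are copied forward in order
        }
        ArrayList<Entity> tmp = live;
        live = spare;
        spare = tmp; // both lists keep their grown capacity between frames
    }
}
```

Because both lists retain their capacity after the first few frames, the steady state allocates nothing at all.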

I think this might go back to that fascinating talk given by Jon Blow on making computer games. Please watch it. It is massively significant to how good you will be at making games, or even just software.

Cas :slight_smile:

That is certainly going on the ‘watch ASAP’ list. Conversely if you have any more of those little treasures floating around in your bookmarks, don’t hesitate to spam :wink:

ONLY THIS. I ams as robots.

Cas :slight_smile:

This is worth a quick read IMHO: http://blogs.valvesoftware.com/abrash/valve-how-i-got-here-what-its-like-and-what-im-doing-2/

Generally speaking:

If preserving element order is important and insert/remove before/after operations are common, a linked list is much faster at those operations than an array.
If you want a persistent data structure (in the sense of immutability, not persisting to storage which is another use of the same word), linked lists are ideal.
In most other cases, use arrays.

BTW, you can do O(1) removes on arrays if you can alter element order (swap the element with the last one, then remove the last).

This is a very elementary, academic subject. I wouldn’t think it is fully in line with this site.

I appreciate what he’s talking about, and when you’re hand-coding the data structures in question then it makes sense to keep it simple.
However, we have access to Java’s wonderful Collections api; so using a HashXYZ is no more effort than using a less complex structure like an ArrayList.

It’s not so much about effort as what’s fastest in what circumstances. LinkedList is an academic curiosity. No-one in their right mind should be using it.

Cas :slight_smile:

I tested the performance of several List implementations. I couldn’t decide how to weigh collection size against add/remove frequency, and the results were surprising: with frequent adds/removes, relatively small linked-list-based Lists outperformed their array-based counterparts, but with large lists it was the opposite. I couldn’t find a clear relationship between size and frequency, and I didn’t want to spend time testing and plotting all kinds of possibilities, so I decided it would be too hard to make a meaningful benchmark.

I would not make generalizations on linked lists vs arrays. I would just use Lists in the form List<Thing> someList = new ListType<Thing>(); and replace the implementation later if needed. In general, you will more likely than not see better performance from ArrayLists than LinkedLists for games on PCs, but that’s not universally true for all hardware (although a real time system is not the usual platform for games) or all applications. I also think it’s naive to use either LinkedList or ArrayLists as an automatic choice of data structure.

Small aside: were you to implement a linked list in the traditional way - with forward and maybe back pointers actually inside the instances you were linking - you would alleviate some of the problems that LinkedList has, which revolve around the need to implement the List interface and the creation of little bits of potentially long-lived support garbage all over the place.
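A minimal sketch of that kind of "intrusive" list, where the link lives inside the object itself so linking and unlinking allocate nothing (all names here are illustrative):

```java
public class IntrusiveList {
    static class Particle {
        Particle next; // the link pointer lives in the instance itself
        float x, y;
    }

    Particle head; // head of the list (e.g. a free list in a particle pool)

    // Push onto the front: no Node wrapper is created, unlike java.util.LinkedList.
    void push(Particle p) {
        p.next = head;
        head = p;
    }

    // Pop from the front, or return null if empty. Nothing becomes garbage:
    // the particle itself is handed back to the caller for reuse.
    Particle pop() {
        Particle p = head;
        if (p != null) {
            head = p.next;
            p.next = null; // unlink so the popped particle doesn't pin the list
        }
        return p;
    }
}
```

This is the usual shape of a pool’s free list: push dead particles, pop them when spawning, zero allocation in the steady state.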

Cas :slight_smile:

the add method on ArrayList is not O(1); if it needs to reallocate the backing array it is O(n).
In general you use ArrayList when you are not adding and removing many times, and LinkedList when you do not need random access to the objects

traversal is faster because of the CPU cache: when you reference a memory location, the architecture loads it into the cache along with some of the memory near it

In fact

adding an item to an ArrayList can’t be considered an O(1) cost, since an ArrayList may perform an array reallocation if the current backing array becomes full or insufficient.
That reallocation instantiates a new array and performs a full copy, at a cost that depends on the size of the list.

So, this thread talks about the memory fragmentation of a LinkedList, but not about the wasted space of an ArrayList, or about the reallocation process that can be a real penalty on add operations.

For example,
create an ArrayList with the default constructor, which means an initial capacity of 10, and insert 600 items into it:
@11th insertion ==> array[10] copied to a new array[15]
@16th insertion ==> array[15] copied to a new array[22]
@23rd insertion ==> array[22] copied to a new array[33]
@34th insertion ==> array[33] copied to a new array[49]
@50th insertion ==> array[49] copied to a new array[73]
@74th insertion ==> array[73] copied to a new array[109]
@110th insertion ==> array[109] copied to a new array[163]
@164th insertion ==> array[163] copied to a new array[244]
@245th insertion ==> array[244] copied to a new array[366]
@367th insertion ==> array[366] copied to a new array[549]
@550th insertion ==> array[549] copied to a new array[823]
@600th insertion ==> you have an array of 823, which means a waste of 223 slots and 11 array instantiations and copies.

Fortunately, in many cases we can estimate the final size of a list before filling it. When that’s the case, we should use the constructor that takes an initial capacity in order to prevent these reallocations. With a LinkedList you don’t have this issue.
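As a sketch of that advice (the figure of 600 matches the example above; the class name is mine):

```java
import java.util.ArrayList;

public class Presized {
    // Sizing the list up front skips every intermediate grow-and-copy
    // in the reallocation table above.
    static ArrayList<Integer> fill(int n) {
        ArrayList<Integer> bullets = new ArrayList<>(n); // capacity n from the start
        for (int i = 0; i < n; i++) {
            bullets.add(i); // never triggers a reallocation
        }
        return bullets;
    }

    public static void main(String[] args) {
        System.out.println(fill(600).size());
    }
}
```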

… because you can’t possibly predict an approximate capacity and/or you can’t stand a single 0.01ms spike to copy 100 objects from one array to another with a super-fast natively optimized Arrays.copyOf() called ONCE? Instead you rely on a linked list, which is slower to add to, produces garbage when removed from, and is slower to traverse? Just create a large enough ArrayList. It’s going to use a lot less memory than a LinkedList, since that one creates a Node object for each entry, and

[quote]a normal object requires 8 bytes of “housekeeping” space
[/quote]
(http://www.javamex.com/tutorials/memory/object_memory_usage.shtml)
plus 3 references * 4 bytes each. That’s 5 times as much data as an ArrayList stores per slot. Add cache inefficiency since the CPU can’t predict which object we’ll read, and it’s seriously not worth it.

add(): O(1) as long as you have a large enough array, which you will eventually get and keep forever. Better than LinkedList’s O(1) since it doesn’t produce garbage
remove(): not needed, we’ll never have to remove an object if we use the double ArrayList technique I’ve linked to over 9000 times by now.
get(index): O(1), much faster than LinkedList.
get in sequence: Faster due to cache coherency, and it doesn’t need to create an iterator object every time you want to loop over it in sequence.
memory usage: 1/5th as much per object.

No, instead we have shitty performance all the time.

Never use LinkedList!

You heard it here first.

Cas :slight_smile: