Consider what happens if you want to erase each element of a collection that meets some condition. Deleting a single element is not an issue because you don't need the iterator afterward. Deleting all elements is similarly easy because you can simply call clear(). Deleting an arbitrary number of elements is generally written this way by most beginners:
set<int>coll;
/* ... */
for (set<int>::iterator itr=coll.begin(); itr!=coll.end(); ++itr)
if (*itr % 2 == 0)
coll.erase(itr);
Predictably, this code doesn't work. The iterator is invalidated when
My first attempted workaround was for the vector class. I wanted to ignore the iterator completely and just step through the collection with an index:
for (unsigned int i=0; i<coll.size(); ++i)
When I tried to call coll.
vector<int> coll;
/* ... */
for (unsigned int i=0; i<coll.size(); ++i)
if (coll[i]%2 == 0)
coll.erase(coll.begin() + i--);
coll.begin() is always a valid iterator (even if it equals "end()"), and you are allowed to use array arithmetic on iterators for vector<> and for deque<>. Although this solutions works, it's a bit of a hack, especially because you have to "fix up" i after the call to
So I was back to figuring out how to get a "valid" iterator after calling
In some STL collections, the
set<int> coll;
/* ... */
for (set<int>::iterator itr=coll.begin(); itr!=coll.end(); ++itr)
if (*itr%2 == 0)
itr = coll.erase(itr);
I liked this solution. No tricks, easy to write, easy to read. But it crashed. Why? Because erase() could return coll.end(), which the for loop would try to increment, which wasn't allowed. So the solution looks like this:
set<int> coll;
/* ... */
set<int>::iterator itr=coll.begin();
while ( itr!=coll.end() )
{
if (*itr%2 == 0)
itr = coll.erase(itr);
else
++itr;
}
The Microsoft version of STL implements map::erase() by returning an iterator, but this is non-standard. I therefore ran into trouble when I switched to STLport and
set<int> coll;
/* ... */
set<int>::iterator itr=coll.begin();
while ( itr!=coll.end() )
{
if (*itr%2 == 0)
{
set<int>::iterator next = itr;
++next;
coll.erase(itr);
itr = next;
}
else
{
++itr;
}
}
Later I learned that the code could be simplified:
set<int> coll;
/* ... */
set<int>::iterator itr=coll.begin();
while ( itr!=coll.end() )
{
if (*itr%2 == 0)
coll.erase(itr++);
else
++itr;
}
Note that we can't use the "hack" from the beginning of this article, which would allow us to write
coll.erase(itr--)
and then increment the iterator in a surrounding for loop. The problem is that a scalar can decrement below zero, but an iterator cannot decrement before begin().Personally, I don't like the autoincrement code above as much as when
The question is, why does this trick work? If you've spent much time in C++, you've inevitably learned about the dangers of code like this:
int i = 5;
i = i++;
The code could end with i equal to either 5 or 6 - the result is undefined because the C++ standard only guarantees that ++ will be executed after i is read and before the semi-colon. It could be incremented before or after the assignment happens. (If you are wondering why, remember that assignments in C++ can be embedded in expressions, unlike most languages.)
To reword the guarantee in the language of the C++ Standard, the only guarantee is that ++ will be executed after i is read and before the next sequence point. The sequence point is the key to understanding our call to
In the expression above, the semicolon is the sequence point. Now look at our code example:
coll.erase(itr++);
If you apply the same evaluation rules, the undefined timing of the execution of ++ would mean that the iterator could be evaluated after
- The value of itr is loaded and pushed on the stack for
erase() . - itr is incremented.
erase() is called.