This article is in response to a discussion at http://stackoverflow.com/questions/18292619/c-string-array-sorting.

The comparison done by Mattingly does two things:

1. It compares apples to oranges: if the object is to create an array of sorted strings, this isn’t equivalent to an array of sorted pointers to C-strings.
2. It measures the time needed to construct `std::string` objects from string literals when using `std::sort()` but directly uses string literals when using `qsort()`. Although there is certainly a cost involved in create `std::string` objects, a roughly similar amount of time is spent in a real example to hold the corresponding C-strings.

Below I posted some code which uses the same tiny data set to compare `std::sort()` and `qsort()` removing the second of these issues in all cases (i.e., the construction of the data set is always removed from the measurements taken). In addition, I have added a way to deal with a bigger data set by reading in the words from a file `sort.in` (I used source of today’s featured Wikipedia entry (http://en.wikipedia.org/wiki/Banksia_violacea) but I’d think any reasonably natural text exhibits similar behavior).

The code also adds a third version of the comparison which uses `std::sort()` but operates on an array of C-strings partly to create something which doesn’t compare apples to oranges but also because it came as a surprise that sorting a large array of `std::string` objects is slower than I expected. I haven’t quite dug to the bottom of the issue but I suspect that the issue the relatively large footprint of `std::string` objects introduced by the now common small string optimization. My measurements indicate that the version shuffling pointers to strings using `std::sort()` is fastest, followed by `qsort()`, with the version directly sorting a larger array of `std::string` objects is slowest, although definitely not by a factor of four. The time measurements I observed using a recent snapshort of gcc are along the lines of those (those for clang seem to do better when using pointers with `std::sort()`):

    sorting sort content of sort.in
    c-sort:   396765
    cpp-ptrs: 375525
    c++-sort: 558646

Especially the amount time taken by `std::sort()` on `std::string` objects is, indeed a surprise. Similar tests in the past indicated that sorting `std::string` using `std::sort()` was faster than sorting C-strings using `qsort()`. I’d blame that on the small string optimization. Still when comparing the same things, `std::sort()` is faster than `qsort()`.

The code I used is below:

    #include <iostream>
    #include <fstream>
    #include <iterator>
    #include <string>
    #include <algorithm>
    #include <mach/mach_time.h>
    #include <string.h>
    #include <vector>
    
    void cpp_sort(std::vector<std::string> values)
    {
        uint64_t ts = mach_absolute_time();
        std::sort(values.begin(), values.end());
        std::cout << “c++-sort: ” << ( mach_absolute_time() – ts ) << ‘\n’;
    }
    
    struct cmp
    {
        bool operator()(char const* a, char const* b) const
        {
            return strcmp(a, b) < 0;
        }
    };
    
    void cpp_ptrs(std::vector<std::string> values)
    {
        std::vector<char const*> temp(values.size());
        std::transform(values.begin(), values.end(), temp.begin(),
                       std::mem_fn(&std::string::c_str));
    
        if (!temp.empty()) {
            uint64_t ts = mach_absolute_time();
            std::sort(temp.begin(), temp.end(), cmp());
            std::cout << “cpp-ptrs: ” << (mach_absolute_time() – ts) << ‘\n’;
        }
    }
    
    static int _sort( const void* a, const void *b )
    {
        return strcmp( *( const char** )a, *( const char** )b );
    }
    
    void c_sort(std::vector<std::string> values)
    {
        std::vector<char const*> temp(values.size());
        std::transform(values.begin(), values.end(), temp.begin(),
                       std::mem_fn(&std::string::c_str));
    
        if (!temp.empty()) {
            uint64_t ts = mach_absolute_time();
            qsort(&temp[0], values.size(), sizeof( const char* ), _sort );
            std::cout << “c-sort:   ” << (mach_absolute_time() – ts) << ‘\n’;
        }
    }
    
    void compare1(std::vector<std::string> const& array)
    {
        c_sort(array);
        cpp_ptrs(array);
        cpp_sort(array);
    }
    
    void compare2(std::vector<std::string> const& array)
    {
        cpp_sort(array);
        cpp_ptrs(array);
        c_sort(array);
    }
    
    
    template <typename T, std::size_t Size>
    std::size_t size(T(&)[Size])
    {
        return Size;
    }
    
    int main(int ac, char *av[])
    {
        switch (ac == 1? 0: atoi(av[1]))
        {
        case 0:
        case 1:
            {
                std::cout << “simple test of fixed array\n”;
                std::string name[] = {
                    “john”, “bobby”, “dear”, “test1”, “catherine”, “nomi”,
                    “shinta”, “martin”, “abe”, “may”, “zeno”, “zack”, “angeal”,
                    “gabby”
                };
                compare1(std::vector<std::string>(name, name + size(name)));
            }
            break;
        case 2:
            {
                std::cout << “simple test of fixed array (just reversed calls)\n”;
                std::string name[] = {
                    “john”, “bobby”, “dear”, “test1”, “catherine”, “nomi”,
                    “shinta”, “martin”, “abe”, “may”, “zeno”, “zack”, “angeal”,
                    “gabby”
                };
                compare2(std::vector<std::string>(name, name + size(name)));
            }
            break;
        case 3:
            {
                std::cout << “sorting sort content of sort.in\n”;
                std::ifstream in(“sort.in”);
                std::istream_iterator<std::string> begin(in), end;
                std::vector<std::string> values(begin, end);
                compare1(values);
            }
            break;
        case 4:
            {
                std::cout << “sorting sort content of sort.in (just reversed calls\n”;
                std::ifstream in(“sort.in”);
                std::istream_iterator<std::string> begin(in), end;
                std::vector<std::string> values(begin, end);
                compare2(values);
            }
            break;
        }
    }