[Solved] Is atoi multithread safe? [closed]


It’s quite easy to implement a replacement for atoi():

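#include <ctype.h>

int strToInt(const char *text)
{
  int n = 0, sign = 1;
  switch (*text) {
    case '-': sign = -1; /* fall through */
    case '+': ++text;
  }
  /* cast to unsigned char: isdigit() is undefined for negative char values */
  for (; isdigit((unsigned char)*text); ++text) n *= 10, n += *text - '0';
  return n * sign;
}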

(Demonstration on ideone)

At first glance, it doesn’t make much sense to replace something that is already available. Thus, I want to mention some thoughts about this.

The implementation can be tailored to your exact requirements:

  • a check for integer overflow may be added
  • the final value of text may be returned (as in strtol()) to check how many characters have been processed or to do further parsing of other contents
  • a variant might be used for unsigned (which does not accept a sign)
  • preceding spaces may or may not be accepted
  • special syntax may be considered
  • and anything else beyond my imagination.
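
The first two items of this list could be sketched as follows. This is only an illustration under assumptions: the name strToIntChecked(), its bool return value, and the pValue and pEnd parameters are made up for this example and are not part of the original strToInt().

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variant of strToInt(): parses an optional sign followed by
 * decimal digits. Returns false if there is no digit or if the value does
 * not fit into an int. On success, *pValue holds the result and, if pEnd
 * is not NULL, *pEnd points to the first character that was not consumed.
 */
bool strToIntChecked(const char *text, int *pValue, const char **pEnd)
{
  assert(text != NULL && pValue != NULL);
  bool neg = false;
  switch (*text) {
    case '-': neg = true; /* fall through */
    case '+': ++text;
  }
  if (!isdigit((unsigned char)*text)) return false; /* no digits at all */
  /* accumulate as a non-positive number so that INT_MIN can be parsed */
  int n = 0;
  for (; isdigit((unsigned char)*text); ++text) {
    const int digit = *text - '0';
    /* n * 10 - digit would drop below INT_MIN */
    if (n < (INT_MIN + digit) / 10) return false;
    n = n * 10 - digit;
  }
  if (!neg) {
    if (n < -INT_MAX) return false; /* positive value would exceed INT_MAX */
    n = -n;
  }
  *pValue = n;
  if (pEnd) *pEnd = text;
  return true;
}
```

With this sketch, strToIntChecked("-123abc", &v, &end) would store -123 in v and leave end pointing at "abc", while a text without digits or with too many digits is rejected. Accumulating the value as a negative number is a design choice: it allows INT_MIN to be parsed without an intermediate overflow.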

Extending this idea to other numeric types such as float or double makes it even more interesting.

As floating-point numbers are definitely subject to localization, this has to be considered. (For decimal integers, I’m not sure what could be localized, but even that might be the case.) If you implement a text file reader that uses C-style floating-point syntax, you must not forget to set the locale to C (using setlocale()) before calling strtod(). (Being German, I’m sensitive to this topic: in the German locale, the meanings of ‘.’ and ‘,’ are exactly swapped compared to English.)

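{ /* setlocale(LC_ALL, "C") returns the NEW locale, so query the old one first
   * and copy it: a later setlocale() call may overwrite the returned string */
  char localeOld[128];
  snprintf(localeOld, sizeof localeOld, "%s", setlocale(LC_ALL, NULL));
  setlocale(LC_ALL, "C");
  value = strtod(text, NULL);
  setlocale(LC_ALL, localeOld);
}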

Another fact is that honoring the locale (even when it is set to C) seems to be rather expensive. Some years ago, we implemented our own floating-point reader as a replacement for strtod(), which provided a speed-up of 60 … 100 in a COLLADA reader (COLLADA is an XML file format whose files often contain lots of floating-point numbers).
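
To illustrate what a locale-independent reader may look like, here is a deliberately naive sketch. It is not the reader we used back then; the name strToDouble() and the pEnd parameter are assumptions for this example. It always treats ‘.’ as the decimal separator and never touches the process-wide locale, but it does not guarantee correctly rounded results and it ignores inf, nan, and hexadecimal notation, all of which strtod() handles.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, naive reader for texts like "-12.5e3".
 * If pEnd is not NULL, *pEnd points to the first character not consumed
 * (or to the start of text if no number was found).
 */
double strToDouble(const char *text, const char **pEnd)
{
  assert(text != NULL);
  const char *start = text;
  double sign = 1.0, value = 0.0;
  int nDigits = 0, exp10 = 0;
  switch (*text) {
    case '-': sign = -1.0; /* fall through */
    case '+': ++text;
  }
  /* integer part */
  for (; isdigit((unsigned char)*text); ++text, ++nDigits) {
    value = value * 10.0 + (*text - '0');
  }
  /* fractional part: every digit shifts the decimal exponent by -1 */
  if (*text == '.') {
    for (++text; isdigit((unsigned char)*text); ++text, ++nDigits, --exp10) {
      value = value * 10.0 + (*text - '0');
    }
  }
  if (nDigits == 0) { /* no digits: nothing was parsed */
    if (pEnd) *pEnd = start;
    return 0.0;
  }
  /* optional exponent, only consumed if at least one digit follows */
  if (*text == 'e' || *text == 'E') {
    const char *p = text + 1;
    int eSign = 1, e = 0;
    switch (*p) {
      case '-': eSign = -1; /* fall through */
      case '+': ++p;
    }
    if (isdigit((unsigned char)*p)) {
      for (; isdigit((unsigned char)*p); ++p) {
        if (e < 10000) e = e * 10 + (*p - '0'); /* cap to avoid int overflow */
      }
      exp10 += eSign * e;
      text = p;
    }
  }
  /* apply the decimal exponent (skipped for 0 to avoid 0 * infinity) */
  if (value != 0.0) {
    double scale = 1.0;
    for (int i = exp10 < 0 ? -exp10 : exp10; i > 0; --i) scale *= 10.0;
    value = exp10 < 0 ? value / scale : value * scale;
  }
  if (pEnd) *pEnd = text;
  return sign * value;
}
```

Whether something this simple is acceptable depends on the accuracy the application needs; the repeated multiplications can introduce rounding errors that strtod() avoids.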

Update:

Encouraged by the feedback of Paul Floyd, I got curious how much faster strToInt() might be. Thus, I built a simple test suite and made some measurements:

#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int strToInt(const char *text)
{
  int n = 0, sign = 1;
  switch (*text) {
    case '-': sign = -1; /* fall through */
    case '+': ++text;
  }
  for (; isdigit((unsigned char)*text); ++text) n *= 10, n += *text - '0';
  return n * sign;
}

int main(int argc, char **argv)
{
  int n = 10000000; /* default number of measurements */
  /* read command line options */
  if (argc > 1) n = atoi(argv[1]);
  if (n <= 0) return 1; /* ERROR */
  /* build samples */
  assert(sizeof(int) <= 8); /* Maybe I want to run this again in 20 years. */
  /* 24 characters should be enough to hold any decimal int
   * (up to 64 bit)
   */
  char (*samples)[24] = malloc((size_t)n * sizeof *samples);
  if (!samples) {
    printf("ERROR: Cannot allocate samples!\n"
      "(Out of memory.)\n");
    return 1;
  }
  for (int i = 0; i < n; ++i) sprintf(samples[i], "%d", i - (i & 1) * n);
  /* assert correct results, ensure fair caching, pre-heat CPU */
  int *retAToI = malloc(n * sizeof(int));
  if (!retAToI) {
    printf("ERROR: Cannot allocate result array for atoi()!\n"
      "(Out of memory.)\n");
    return 1;
  }
  int *retStrToInt = malloc(n * sizeof(int));
  if (!retStrToInt) {
    printf("ERROR: Cannot allocate result array for strToInt()!\n"
      "(Out of memory.)\n");
    return 1;
  }
  int nErrors = 0;
  for (int i = 0; i < n; ++i) {
    retAToI[i] = atoi(samples[i]); retStrToInt[i] = strToInt(samples[i]);
    if (retAToI[i] != retStrToInt[i]) {
      printf("ERROR: atoi(\"%s\"): %d, strToInt(\"%s\"): %d!\n",
        samples[i], retAToI[i], samples[i], retStrToInt[i]);
      ++nErrors;
    }
  }
  if (nErrors) {
    printf("%d ERRORs found!\n", nErrors);
    return 2;
  }
  /* do measurements */
  enum { nTries = 10 };
  clock_t tTbl[nTries][2]; /* clock() returns clock_t, not time_t */
  for (int i = 0; i < nTries; ++i) {
    printf("Measurement %d:\n", i + 1);
    { clock_t t0 = clock();
      for (int k = 0; k < n; ++k) retAToI[k] = atoi(samples[k]);
      tTbl[i][0] = clock() - t0;
    }
    { clock_t t0 = clock();
      for (int k = 0; k < n; ++k) retStrToInt[k] = strToInt(samples[k]);
      tTbl[i][1] = clock() - t0;
    }
    /* assert correct results (and prevent the measurement from being optimized away) */
    for (int k = 0; k < n; ++k) if (retAToI[k] != retStrToInt[k]) return 3;
  }
  /* report */
  printf("Report:\n");
  printf("%20s|%20s\n", "atoi() ", "strToInt() ");
  printf("--------------------+--------------------\n");
  double tAvg[2] = { 0.0, 0.0 }; const char *sep = "|\n";
  for (int i = 0; i < nTries; ++i) {
    for (int j = 0; j < 2; ++j) {
      double t = (double)tTbl[i][j] / CLOCKS_PER_SEC;
      printf("%19.3f %c", t, sep[j]);
      tAvg[j] += t;
    }
  }
  printf("--------------------+--------------------\n");
  for (int j = 0; j < 2; ++j) printf("%19.3f %c", tAvg[j] / nTries, sep[j]);
  /* done */
  return 0;
}

I tried this on some platforms.

VS2013 on Windows 10 (64 bit), Release mode:

Report:
             atoi() |         strToInt()
--------------------+--------------------
              0.232 |              0.200
              0.310 |              0.240
              0.253 |              0.199
              0.231 |              0.201
              0.232 |              0.253
              0.247 |              0.201
              0.238 |              0.201
              0.247 |              0.223
              0.248 |              0.200
              0.249 |              0.200
--------------------+--------------------
              0.249 |              0.212

gcc 5.4.0 on cygwin, Windows 10 (64 bit), gcc -std=c11 -O2:

Report:
             atoi() |         strToInt() 
--------------------+--------------------
              0.360 |              0.312 
              0.391 |              0.250 
              0.360 |              0.328 
              0.391 |              0.312 
              0.375 |              0.281 
              0.359 |              0.282 
              0.375 |              0.297 
              0.391 |              0.250 
              0.359 |              0.297 
              0.406 |              0.281 
--------------------+--------------------
              0.377 |              0.289

Sample uploaded and executed on codingground (gcc 4.8.5 on Linux 3.10.0-327.36.3.el7.x86_64, gcc -std=c11 -O2):

Report:
             atoi() |         strToInt() 
--------------------+--------------------
              1.080 |              0.750 
              1.000 |              0.780 
              0.980 |              0.770 
              1.010 |              0.770 
              1.000 |              0.770 
              1.010 |              0.780 
              1.010 |              0.780 
              1.010 |              0.770 
              1.020 |              0.780 
              1.020 |              0.780 
--------------------+--------------------
              1.014 |              0.773 

Well, strToInt() is a little bit faster. (Without -O2, it was even slower than atoi(), but the standard library was probably compiled with optimization as well.)

Note:

As the time measurement also includes assignment and loop operations, it only provides a qualitative statement about which one is faster, not a quantitative factor. (Getting one would require a much more elaborate measurement.)

Given how simple atoi() is, an application would have to call it very often before the development effort for a replacement becomes worth considering…
