Comment 30 for bug 678808

tksharpless (tksharpless-users) wrote :

I used MSVC 2005 Standard Edition to build the test program, which eventually took the form below. It shows the same failures as one built with 2008 Express -- see comments with code.

I think this is not really a bug, though it is definitely a limitation of Windows, which still has to support codepage-based character sets. The best solution is clearly to pass only Unicode file names to the fstream c'tors; then the codepage issues disappear.

--Tom

/* RussianBug.cpp

  Test program to investigate a bug reported in hugin where
  std::ifstream fails to open files with Cyrillic names.

  Bug tracker link:
  http://sourceforge.net/tracker/index.php?func=detail&aid=1908349&group_id=77506&atid=550441

  Original by Pablo d'Angelo and MaxTee, who reported the bug
  and says:
"
 I'm using WinXP, English version, Regional settings as follows:

 (Control Panel/Regional Options) Locale set to: Russian;
 (Control panel/Advanced) Language for non-unicode programs: Russian

 I believe these settings set OEM codepage 866 and ANSI codepage 1251.
"

TKS mods:

  Report the effective codepage number.
  Try ifstream() with 4 filename flavors: argv[]; argv[] translated
    to unicode with current codepage and with Russian one; same
 argument read from commandline as unicode.

TKS findings:

  The global Windows codepage setting has no effect on this program.
  The codepage it reports is the one set in the "advanced" option for
  non-unicode programs.

  That codepage determines whether the commandline arguments are read
  correctly into argv[], and apparently also whether ifstream() can
  translate them correctly into unicode (which is the eventual format
  passed to the OS). If they are not read correctly, unknown chars get
  replaced by '?', and translating to unicode fails.

  The ifstream c'tor is overloaded (in the MSVC library) to accept either
  ANSI or Unicode filename arguments; however, the ANSI ones must be
  representable in the effective codepage.

  When translation fails, the eventual result is a "file not found" error
  from the OS -- nobody notices the string format problem.

  If the commandline is read as Unicode, then the special codepage
  is not needed, and ifstream( unicode_name ) always succeeds.

*/

#include <fstream>
#include <stdio.h>
#include <windows.h>

int main(int argc, char * argv[])
{

 unsigned codepage = GetACP();
 printf("\nThe current Windows code page is %u\n", codepage);

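/* read the command line as Unicode (explicit W variant, so this also
   works in a non-UNICODE build) */
 int Wargc = 0;
 LPWSTR * Wargv = CommandLineToArgvW( GetCommandLineW(), & Wargc );
 if (Wargv == NULL) {
  printf("CommandLineToArgvW failed\n");
  return 1;
 }

 /* argv[] and Wargv[] are indexed in parallel; stop at the shorter one */
 for(int i = 1; i < argc && i < Wargc; i++ ){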

  printf("\n(ANSI) argv[%d] is '%s'\n", i, argv[i]);
  printf("Targv = argv translated to Unicode with current codepage\n" );
  printf("Rargv = argv translated to Unicode with Russian codepage\n" );
  printf("Wargv = argv read as Unicode\n");
  printf("\n");

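  /* zero-initialize so a failed call leaves an empty string */
  wchar_t Targv[200] = L"";
  int k = MultiByteToWideChar(
  CP_ACP, // (use current codepage),
  0, // DWORD dwFlags,
  argv[i], // LPCSTR lpMultiByteStr,
  -1, // (is null terminated)
  Targv, // LPWSTR lpWideCharStr,
  200 // int cchWideChar
  );
  /* returns 0 on failure, e.g. when the buffer is too small */
  if (k == 0) printf(" MultiByteToWideChar( CP_ACP ) failed\n");

  wchar_t Rargv[200] = L"";
  k = MultiByteToWideChar(
  1251, // (use Russian codepage),
  0, // DWORD dwFlags,
  argv[i], // LPCSTR lpMultiByteStr,
  -1, // (is null terminated)
  Rargv, // LPWSTR lpWideCharStr,
  200 // int cchWideChar
  );
  if (k == 0) printf(" MultiByteToWideChar( 1251 ) failed\n");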

  printf(" fopen( argv ) ");
  FILE * f = fopen( argv[i], "rb");
  if (f) {
  printf("OK\n");
  fclose(f);
  } else {
  printf("FAIL\n");
  }

  printf(" ifstream( argv ) ");
  std::ifstream fin0( argv[i], std::ios::binary );
  if ( fin0.good() ) {
   printf("OK\n");
  } else {
   printf("FAIL\n");
  }

  printf(" ifstream( Targv ) ");
  std::ifstream fin1( Targv, std::ios::binary);
  if ( fin1.good() ) {
   printf("OK\n");
  } else {
   printf("FAIL\n");
  }

  printf(" ifstream( Rargv ) ");
  /* open the name translated with the Russian codepage (was Targv) */
  std::ifstream fin3( Rargv, std::ios::binary);
  if ( fin3.good() ) {
   printf("OK\n");
  } else {
   printf("FAIL\n");
  }

  printf(" ifstream( Wargv ) ");
  std::ifstream fin2( Wargv[i], std::ios::binary );
  if ( fin2.good() ) {
   printf("OK\n");
  } else {
   printf("FAIL\n");
  }
  }

 LocalFree( Wargv ); /* the array from CommandLineToArgvW must be freed */
 return 0;
}
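
For anyone without a Windows box handy, here is a small portable sketch of why the same argv[] bytes come out as different Unicode names depending on the codepage used for the translation. It is not what MultiByteToWideChar does internally; the function names (decode_cp1251, decode_cp1252, decode) are made up for illustration, and the tables are deliberately incomplete. They cover only ASCII plus the ranges I am sure of: bytes 0xC0..0xFF are U+0410..U+044F in codepage 1251, and bytes 0xA0..0xFF match Latin-1 in codepage 1252. Everything else is reported as U+FFFD.

```cpp
#include <string>

// Hypothetical helper: decode one byte under a simplified codepage 1251.
// Only ASCII and the block 0xC0..0xFF (U+0410..U+044F) are handled;
// other bytes are returned as U+FFFD in this sketch.
char32_t decode_cp1251(unsigned char b) {
    if (b < 0x80) return b;
    if (b >= 0xC0) return 0x0410 + (b - 0xC0);
    return 0xFFFD;
}

// Hypothetical helper: decode one byte under a simplified codepage 1252.
// Bytes 0xA0..0xFF have the same values as in Latin-1; the range
// 0x80..0x9F is left out of this sketch and returned as U+FFFD.
char32_t decode_cp1252(unsigned char b) {
    if (b < 0x80 || b >= 0xA0) return b;
    return 0xFFFD;
}

// Translate a narrow (codepage-encoded) string into code points using
// the given per-byte table.
std::u32string decode(const std::string& bytes,
                      char32_t (*table)(unsigned char)) {
    std::u32string out;
    for (unsigned char b : bytes) {
        out.push_back(table(b));
    }
    return out;
}
```

Feeding the four bytes F4 EE F2 EE followed by ".jpg" through decode() with the 1251 table gives the Cyrillic code points U+0444 U+043E U+0442 U+043E, while the 1252 table gives the accented Latin letters U+00F4 U+00EE U+00F2 U+00EE. The second name does not exist on disk, which matches the "file not found" result described in the findings.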