Mail Archives: djgpp-workers/1998/12/14/09:57:05
On Mon, 14 Dec 1998, Salvador Eduardo Tropea (SET) wrote:
> 1) The import command imports the files using fopen(file,"r"); and for some
> strange reason Tim used _fmode=O_BINARY
In my experience, it is almost never a good idea to use _fmode=O_BINARY
globally. Most programs need to open some files in text mode and others
in binary mode, so _fmode doesn't save much trouble.
> Anyways that's a serious bug because then the text files are saved in the
> repository with \r\n. I think that here CVS should check the mode and open
> the file according it.
I don't really know much about CVS internals, so what's below are some
general remarks about this text/binary nuisance, mainly in the context of
revision-control software.
Saving a source file with the DOS-style CRLF EOLs is indeed a Bad Idea
(IMHO). The main problem with that is that the master file is then
non-portable to Unix. So I think source files should be checked in with
the CR characters stripped.
However, CVS can also be used to store non-text files. Clearly, those
should have all their characters preserved.
Therefore, opening in text mode in all cases is not a good solution
either. CVS should peek at some portion of the file, decide if it is
text or binary (maybe even provide an option for the user to force one of
these), and then strip all CR characters from the CRLF pairs if that's a
text file. That requires a binary open for reading.
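To illustrate the "peek and decide" step, here is a minimal sketch. It
is not taken from CVS; the name looks_like_text, the 512-byte sample
size and the "a NUL byte means binary" test are all assumptions made
for this example, and a real program would probably want a better
heuristic plus the user override mentioned above.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the beginning of the file looks like text, 0 if it looks
   binary, -1 if the file cannot be opened.  The test used here (any NUL
   byte in the first 512 bytes means binary) is only an assumed
   heuristic, not something CVS is known to do.  */
int
looks_like_text (const char *fname)
{
  unsigned char buf[512];
  size_t n;
  FILE *fp = fopen (fname, "rb");	/* binary, so we see the raw bytes */

  if (fp == NULL)
    return -1;
  n = fread (buf, 1, sizeof buf, fp);
  fclose (fp);
  /* An empty file (n == 0) is treated as text.  */
  return memchr (buf, 0, n) == NULL;
}
```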
Note that it is a mistake to strip ALL the CR characters, including those
not followed by an LF: this will cause subtle bugs if the source file has
literal CR characters in it. Only a CR immediately before an LF should be
stripped from a text file. This requires reading the file in binary mode
and then looping through it, stripping the CRs manually, since no library
function is smart enough to do this for you.
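A loop of that kind could look roughly like the sketch below. The name
strip_crlf is made up for this example, and the code assumes the whole
file was already read into one buffer in binary mode; if the file is
read in chunks, a CR that happens to be the last byte of a chunk needs
to be carried over to the next one, which is not shown here.

```c
#include <stddef.h>

/* Remove each CR that is immediately followed by an LF; lone CRs are
   left alone.  Works in place and returns the new length.  */
size_t
strip_crlf (char *buf, size_t len)
{
  size_t in, out = 0;

  for (in = 0; in < len; in++)
    {
      if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
	continue;	/* drop the CR; the LF is copied on the next pass */
      buf[out++] = buf[in];
    }
  return out;
}
```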
A related question is how to write the file when it is checked out.
Clearly, a non-text file should be written verbatim. For a text file, the
answer is less obvious, but I think DOS-style CRLF format is better. For
example, imagine a source file which originally had strings with CRLF
pairs inside it: these would be stripped by check-in, and must be added on
check-out, to prevent the program from breaking. Also, many DOS editors still
have trouble with Unix-style EOLs (I was surprised to learn that even
MultiEdit has this bug).
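For the check-out direction, one possible sketch is below. The name
write_with_crlf is hypothetical, and the function assumes the stream
was opened in binary mode so that the library adds no CRs of its own.
(On DOS, simply writing through a text-mode stream gives the same
result; the explicit loop is just easier to reason about.)

```c
#include <stdio.h>

/* Write LEN bytes of BUF to FP, putting a CR in front of every LF.
   FP is assumed to be open in binary mode.  Returns 0 on success,
   -1 on a write error.  */
int
write_with_crlf (FILE *fp, const char *buf, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    {
      if (buf[i] == '\n' && putc ('\r', fp) == EOF)
	return -1;
      if (putc ((unsigned char) buf[i], fp) == EOF)
	return -1;
    }
  return 0;
}
```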
If it is possible, CVS should record the type of the file (text or
binary) in the repository when the file is checked in, and use that info
when it writes the file out later.
The above still has some problems, e.g. when a source file has embedded
strings which end in a single literal newline. But these cases are rare.
> I have a working patch for it that checks the mode and explicitly selects
> "wb" or "wt" (no default assumptions for "w"=xxxx).
Some old non-ANSI compilers don't support "wt" and "wb". You might be
better off defining a bunch of macros, like FOPEN_WBIN_MODE, so that
users of other compilers could define them as appropriate.
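Such macros might be set up along these lines. Only FOPEN_WBIN_MODE is
a name I actually suggested above; FOPEN_RBIN_MODE, FOPEN_WTXT_MODE and
the helper open_for_write are invented for this sketch, and the default
values are assumptions that each port would need to check.

```c
#include <stdio.h>

/* Each macro can be predefined by the user, e.g. on the compiler
   command line, if the compiler's library rejects the default.  */
#ifndef FOPEN_RBIN_MODE
#define FOPEN_RBIN_MODE "rb"
#endif
#ifndef FOPEN_WBIN_MODE
#define FOPEN_WBIN_MODE "wb"
#endif
#ifndef FOPEN_WTXT_MODE
# ifdef __MSDOS__
#  define FOPEN_WTXT_MODE "wt"	/* explicit, so _fmode cannot change it */
# else
#  define FOPEN_WTXT_MODE "w"	/* "wt" is not ANSI C */
# endif
#endif

/* Open FNAME for writing in the mode that matches its recorded type.  */
FILE *
open_for_write (const char *fname, int is_binary)
{
  return fopen (fname, is_binary ? FOPEN_WBIN_MODE : FOPEN_WTXT_MODE);
}
```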
> I think is much better than RCS
CVS and RCS are not generally interchangeable: they are designed to work
on different levels and support different development models.