| This is libc.info, produced by makeinfo version 5.2 from libc.texinfo. |
| |
| This file documents the GNU C Library. |
| |
| This is 'The GNU C Library Reference Manual', for version 2.19 |
| (Buildroot). |
| |
| Copyright (C) 1993-2014 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being "Free Software Needs Free Documentation" and |
| "GNU Lesser General Public License", the Front-Cover texts being "A GNU |
| Manual", and with the Back-Cover Texts as in (a) below. A copy of the |
| license is included in the section entitled "GNU Free Documentation |
| License". |
| |
| (a) The FSF's Back-Cover Text is: "You have the freedom to copy and |
| modify this GNU manual. Buying copies from the FSF supports it in |
| developing GNU and promoting software freedom." |
| INFO-DIR-SECTION Software libraries |
| START-INFO-DIR-ENTRY |
| * Libc: (libc). C library. |
| END-INFO-DIR-ENTRY |
| |
| INFO-DIR-SECTION GNU C library functions and macros |
| START-INFO-DIR-ENTRY |
| * ALTWERASE: (libc)Local Modes. |
| * ARGP_ERR_UNKNOWN: (libc)Argp Parser Functions. |
| * ARG_MAX: (libc)General Limits. |
| * BC_BASE_MAX: (libc)Utility Limits. |
| * BC_DIM_MAX: (libc)Utility Limits. |
| * BC_SCALE_MAX: (libc)Utility Limits. |
| * BC_STRING_MAX: (libc)Utility Limits. |
| * BRKINT: (libc)Input Modes. |
| * BUFSIZ: (libc)Controlling Buffering. |
| * CCTS_OFLOW: (libc)Control Modes. |
| * CHILD_MAX: (libc)General Limits. |
| * CIGNORE: (libc)Control Modes. |
| * CLK_TCK: (libc)Processor Time. |
| * CLOCAL: (libc)Control Modes. |
| * CLOCKS_PER_SEC: (libc)CPU Time. |
| * COLL_WEIGHTS_MAX: (libc)Utility Limits. |
| * CPU_CLR: (libc)CPU Affinity. |
| * CPU_ISSET: (libc)CPU Affinity. |
| * CPU_SET: (libc)CPU Affinity. |
| * CPU_SETSIZE: (libc)CPU Affinity. |
| * CPU_ZERO: (libc)CPU Affinity. |
| * CREAD: (libc)Control Modes. |
| * CRTS_IFLOW: (libc)Control Modes. |
| * CS5: (libc)Control Modes. |
| * CS6: (libc)Control Modes. |
| * CS7: (libc)Control Modes. |
| * CS8: (libc)Control Modes. |
| * CSIZE: (libc)Control Modes. |
| * CSTOPB: (libc)Control Modes. |
| * DES_FAILED: (libc)DES Encryption. |
| * DTTOIF: (libc)Directory Entries. |
| * E2BIG: (libc)Error Codes. |
| * EACCES: (libc)Error Codes. |
| * EADDRINUSE: (libc)Error Codes. |
| * EADDRNOTAVAIL: (libc)Error Codes. |
| * EADV: (libc)Error Codes. |
| * EAFNOSUPPORT: (libc)Error Codes. |
| * EAGAIN: (libc)Error Codes. |
| * EALREADY: (libc)Error Codes. |
| * EAUTH: (libc)Error Codes. |
| * EBACKGROUND: (libc)Error Codes. |
| * EBADE: (libc)Error Codes. |
| * EBADF: (libc)Error Codes. |
| * EBADFD: (libc)Error Codes. |
| * EBADMSG: (libc)Error Codes. |
| * EBADR: (libc)Error Codes. |
| * EBADRPC: (libc)Error Codes. |
| * EBADRQC: (libc)Error Codes. |
| * EBADSLT: (libc)Error Codes. |
| * EBFONT: (libc)Error Codes. |
| * EBUSY: (libc)Error Codes. |
| * ECANCELED: (libc)Error Codes. |
| * ECHILD: (libc)Error Codes. |
| * ECHO: (libc)Local Modes. |
| * ECHOCTL: (libc)Local Modes. |
| * ECHOE: (libc)Local Modes. |
| * ECHOK: (libc)Local Modes. |
| * ECHOKE: (libc)Local Modes. |
| * ECHONL: (libc)Local Modes. |
| * ECHOPRT: (libc)Local Modes. |
| * ECHRNG: (libc)Error Codes. |
| * ECOMM: (libc)Error Codes. |
| * ECONNABORTED: (libc)Error Codes. |
| * ECONNREFUSED: (libc)Error Codes. |
| * ECONNRESET: (libc)Error Codes. |
| * ED: (libc)Error Codes. |
| * EDEADLK: (libc)Error Codes. |
| * EDEADLOCK: (libc)Error Codes. |
| * EDESTADDRREQ: (libc)Error Codes. |
| * EDIED: (libc)Error Codes. |
| * EDOM: (libc)Error Codes. |
| * EDOTDOT: (libc)Error Codes. |
| * EDQUOT: (libc)Error Codes. |
| * EEXIST: (libc)Error Codes. |
| * EFAULT: (libc)Error Codes. |
| * EFBIG: (libc)Error Codes. |
| * EFTYPE: (libc)Error Codes. |
| * EGRATUITOUS: (libc)Error Codes. |
| * EGREGIOUS: (libc)Error Codes. |
| * EHOSTDOWN: (libc)Error Codes. |
| * EHOSTUNREACH: (libc)Error Codes. |
| * EHWPOISON: (libc)Error Codes. |
| * EIDRM: (libc)Error Codes. |
| * EIEIO: (libc)Error Codes. |
| * EILSEQ: (libc)Error Codes. |
| * EINPROGRESS: (libc)Error Codes. |
| * EINTR: (libc)Error Codes. |
| * EINVAL: (libc)Error Codes. |
| * EIO: (libc)Error Codes. |
| * EISCONN: (libc)Error Codes. |
| * EISDIR: (libc)Error Codes. |
| * EISNAM: (libc)Error Codes. |
| * EKEYEXPIRED: (libc)Error Codes. |
| * EKEYREJECTED: (libc)Error Codes. |
| * EKEYREVOKED: (libc)Error Codes. |
| * EL2HLT: (libc)Error Codes. |
| * EL2NSYNC: (libc)Error Codes. |
| * EL3HLT: (libc)Error Codes. |
| * EL3RST: (libc)Error Codes. |
| * ELIBACC: (libc)Error Codes. |
| * ELIBBAD: (libc)Error Codes. |
| * ELIBEXEC: (libc)Error Codes. |
| * ELIBMAX: (libc)Error Codes. |
| * ELIBSCN: (libc)Error Codes. |
| * ELNRNG: (libc)Error Codes. |
| * ELOOP: (libc)Error Codes. |
| * EMEDIUMTYPE: (libc)Error Codes. |
| * EMFILE: (libc)Error Codes. |
| * EMLINK: (libc)Error Codes. |
| * EMSGSIZE: (libc)Error Codes. |
| * EMULTIHOP: (libc)Error Codes. |
| * ENAMETOOLONG: (libc)Error Codes. |
| * ENAVAIL: (libc)Error Codes. |
| * ENEEDAUTH: (libc)Error Codes. |
| * ENETDOWN: (libc)Error Codes. |
| * ENETRESET: (libc)Error Codes. |
| * ENETUNREACH: (libc)Error Codes. |
| * ENFILE: (libc)Error Codes. |
| * ENOANO: (libc)Error Codes. |
| * ENOBUFS: (libc)Error Codes. |
| * ENOCSI: (libc)Error Codes. |
| * ENODATA: (libc)Error Codes. |
| * ENODEV: (libc)Error Codes. |
| * ENOENT: (libc)Error Codes. |
| * ENOEXEC: (libc)Error Codes. |
| * ENOKEY: (libc)Error Codes. |
| * ENOLCK: (libc)Error Codes. |
| * ENOLINK: (libc)Error Codes. |
| * ENOMEDIUM: (libc)Error Codes. |
| * ENOMEM: (libc)Error Codes. |
| * ENOMSG: (libc)Error Codes. |
| * ENONET: (libc)Error Codes. |
| * ENOPKG: (libc)Error Codes. |
| * ENOPROTOOPT: (libc)Error Codes. |
| * ENOSPC: (libc)Error Codes. |
| * ENOSR: (libc)Error Codes. |
| * ENOSTR: (libc)Error Codes. |
| * ENOSYS: (libc)Error Codes. |
| * ENOTBLK: (libc)Error Codes. |
| * ENOTCONN: (libc)Error Codes. |
| * ENOTDIR: (libc)Error Codes. |
| * ENOTEMPTY: (libc)Error Codes. |
| * ENOTNAM: (libc)Error Codes. |
| * ENOTRECOVERABLE: (libc)Error Codes. |
| * ENOTSOCK: (libc)Error Codes. |
| * ENOTSUP: (libc)Error Codes. |
| * ENOTTY: (libc)Error Codes. |
| * ENOTUNIQ: (libc)Error Codes. |
| * ENXIO: (libc)Error Codes. |
| * EOF: (libc)EOF and Errors. |
| * EOPNOTSUPP: (libc)Error Codes. |
| * EOVERFLOW: (libc)Error Codes. |
| * EOWNERDEAD: (libc)Error Codes. |
| * EPERM: (libc)Error Codes. |
| * EPFNOSUPPORT: (libc)Error Codes. |
| * EPIPE: (libc)Error Codes. |
| * EPROCLIM: (libc)Error Codes. |
| * EPROCUNAVAIL: (libc)Error Codes. |
| * EPROGMISMATCH: (libc)Error Codes. |
| * EPROGUNAVAIL: (libc)Error Codes. |
| * EPROTO: (libc)Error Codes. |
| * EPROTONOSUPPORT: (libc)Error Codes. |
| * EPROTOTYPE: (libc)Error Codes. |
| * EQUIV_CLASS_MAX: (libc)Utility Limits. |
| * ERANGE: (libc)Error Codes. |
| * EREMCHG: (libc)Error Codes. |
| * EREMOTE: (libc)Error Codes. |
| * EREMOTEIO: (libc)Error Codes. |
| * ERESTART: (libc)Error Codes. |
| * ERFKILL: (libc)Error Codes. |
| * EROFS: (libc)Error Codes. |
| * ERPCMISMATCH: (libc)Error Codes. |
| * ESHUTDOWN: (libc)Error Codes. |
| * ESOCKTNOSUPPORT: (libc)Error Codes. |
| * ESPIPE: (libc)Error Codes. |
| * ESRCH: (libc)Error Codes. |
| * ESRMNT: (libc)Error Codes. |
| * ESTALE: (libc)Error Codes. |
| * ESTRPIPE: (libc)Error Codes. |
| * ETIME: (libc)Error Codes. |
| * ETIMEDOUT: (libc)Error Codes. |
| * ETOOMANYREFS: (libc)Error Codes. |
| * ETXTBSY: (libc)Error Codes. |
| * EUCLEAN: (libc)Error Codes. |
| * EUNATCH: (libc)Error Codes. |
| * EUSERS: (libc)Error Codes. |
| * EWOULDBLOCK: (libc)Error Codes. |
| * EXDEV: (libc)Error Codes. |
| * EXFULL: (libc)Error Codes. |
| * EXIT_FAILURE: (libc)Exit Status. |
| * EXIT_SUCCESS: (libc)Exit Status. |
| * EXPR_NEST_MAX: (libc)Utility Limits. |
| * FD_CLOEXEC: (libc)Descriptor Flags. |
| * FD_CLR: (libc)Waiting for I/O. |
| * FD_ISSET: (libc)Waiting for I/O. |
| * FD_SET: (libc)Waiting for I/O. |
| * FD_SETSIZE: (libc)Waiting for I/O. |
| * FD_ZERO: (libc)Waiting for I/O. |
| * FILENAME_MAX: (libc)Limits for Files. |
| * FLUSHO: (libc)Local Modes. |
| * FOPEN_MAX: (libc)Opening Streams. |
| * FP_ILOGB0: (libc)Exponents and Logarithms. |
| * FP_ILOGBNAN: (libc)Exponents and Logarithms. |
| * F_DUPFD: (libc)Duplicating Descriptors. |
| * F_GETFD: (libc)Descriptor Flags. |
| * F_GETFL: (libc)Getting File Status Flags. |
| * F_GETLK: (libc)File Locks. |
| * F_GETOWN: (libc)Interrupt Input. |
| * F_OK: (libc)Testing File Access. |
| * F_SETFD: (libc)Descriptor Flags. |
| * F_SETFL: (libc)Getting File Status Flags. |
| * F_SETLK: (libc)File Locks. |
| * F_SETLKW: (libc)File Locks. |
| * F_SETOWN: (libc)Interrupt Input. |
| * HUGE_VAL: (libc)Math Error Reporting. |
| * HUGE_VALF: (libc)Math Error Reporting. |
| * HUGE_VALL: (libc)Math Error Reporting. |
| * HUPCL: (libc)Control Modes. |
| * I: (libc)Complex Numbers. |
| * ICANON: (libc)Local Modes. |
| * ICRNL: (libc)Input Modes. |
| * IEXTEN: (libc)Local Modes. |
| * IFNAMSIZ: (libc)Interface Naming. |
| * IFTODT: (libc)Directory Entries. |
| * IGNBRK: (libc)Input Modes. |
| * IGNCR: (libc)Input Modes. |
| * IGNPAR: (libc)Input Modes. |
| * IMAXBEL: (libc)Input Modes. |
| * INADDR_ANY: (libc)Host Address Data Type. |
| * INADDR_BROADCAST: (libc)Host Address Data Type. |
| * INADDR_LOOPBACK: (libc)Host Address Data Type. |
| * INADDR_NONE: (libc)Host Address Data Type. |
| * INFINITY: (libc)Infinity and NaN. |
| * INLCR: (libc)Input Modes. |
| * INPCK: (libc)Input Modes. |
| * IPPORT_RESERVED: (libc)Ports. |
| * IPPORT_USERRESERVED: (libc)Ports. |
| * ISIG: (libc)Local Modes. |
| * ISTRIP: (libc)Input Modes. |
| * IXANY: (libc)Input Modes. |
| * IXOFF: (libc)Input Modes. |
| * IXON: (libc)Input Modes. |
| * LINE_MAX: (libc)Utility Limits. |
| * LINK_MAX: (libc)Limits for Files. |
| * L_ctermid: (libc)Identifying the Terminal. |
| * L_cuserid: (libc)Who Logged In. |
| * L_tmpnam: (libc)Temporary Files. |
| * MAXNAMLEN: (libc)Limits for Files. |
| * MAXSYMLINKS: (libc)Symbolic Links. |
| * MAX_CANON: (libc)Limits for Files. |
| * MAX_INPUT: (libc)Limits for Files. |
| * MB_CUR_MAX: (libc)Selecting the Conversion. |
| * MB_LEN_MAX: (libc)Selecting the Conversion. |
| * MDMBUF: (libc)Control Modes. |
| * MSG_DONTROUTE: (libc)Socket Data Options. |
| * MSG_OOB: (libc)Socket Data Options. |
| * MSG_PEEK: (libc)Socket Data Options. |
| * NAME_MAX: (libc)Limits for Files. |
| * NAN: (libc)Infinity and NaN. |
| * NCCS: (libc)Mode Data Types. |
| * NGROUPS_MAX: (libc)General Limits. |
| * NOFLSH: (libc)Local Modes. |
| * NOKERNINFO: (libc)Local Modes. |
| * NSIG: (libc)Standard Signals. |
| * NULL: (libc)Null Pointer Constant. |
| * ONLCR: (libc)Output Modes. |
| * ONOEOT: (libc)Output Modes. |
| * OPEN_MAX: (libc)General Limits. |
| * OPOST: (libc)Output Modes. |
| * OXTABS: (libc)Output Modes. |
| * O_ACCMODE: (libc)Access Modes. |
| * O_APPEND: (libc)Operating Modes. |
| * O_ASYNC: (libc)Operating Modes. |
| * O_CREAT: (libc)Open-time Flags. |
| * O_EXCL: (libc)Open-time Flags. |
| * O_EXEC: (libc)Access Modes. |
| * O_EXLOCK: (libc)Open-time Flags. |
| * O_FSYNC: (libc)Operating Modes. |
| * O_IGNORE_CTTY: (libc)Open-time Flags. |
| * O_NDELAY: (libc)Operating Modes. |
| * O_NOATIME: (libc)Operating Modes. |
| * O_NOCTTY: (libc)Open-time Flags. |
| * O_NOLINK: (libc)Open-time Flags. |
| * O_NONBLOCK: (libc)Open-time Flags. |
| * O_NONBLOCK: (libc)Operating Modes. |
| * O_NOTRANS: (libc)Open-time Flags. |
| * O_RDONLY: (libc)Access Modes. |
| * O_RDWR: (libc)Access Modes. |
| * O_READ: (libc)Access Modes. |
| * O_SHLOCK: (libc)Open-time Flags. |
| * O_SYNC: (libc)Operating Modes. |
| * O_TRUNC: (libc)Open-time Flags. |
| * O_WRITE: (libc)Access Modes. |
| * O_WRONLY: (libc)Access Modes. |
| * PARENB: (libc)Control Modes. |
| * PARMRK: (libc)Input Modes. |
| * PARODD: (libc)Control Modes. |
| * PATH_MAX: (libc)Limits for Files. |
| * PA_FLAG_MASK: (libc)Parsing a Template String. |
| * PENDIN: (libc)Local Modes. |
| * PF_FILE: (libc)Local Namespace Details. |
| * PF_INET6: (libc)Internet Namespace. |
| * PF_INET: (libc)Internet Namespace. |
| * PF_LOCAL: (libc)Local Namespace Details. |
| * PF_UNIX: (libc)Local Namespace Details. |
| * PIPE_BUF: (libc)Limits for Files. |
| * P_tmpdir: (libc)Temporary Files. |
| * RAND_MAX: (libc)ISO Random. |
| * RE_DUP_MAX: (libc)General Limits. |
| * RLIM_INFINITY: (libc)Limits on Resources. |
| * R_OK: (libc)Testing File Access. |
| * SA_NOCLDSTOP: (libc)Flags for Sigaction. |
| * SA_ONSTACK: (libc)Flags for Sigaction. |
| * SA_RESTART: (libc)Flags for Sigaction. |
| * SEEK_CUR: (libc)File Positioning. |
| * SEEK_END: (libc)File Positioning. |
| * SEEK_SET: (libc)File Positioning. |
| * SIGABRT: (libc)Program Error Signals. |
| * SIGALRM: (libc)Alarm Signals. |
| * SIGBUS: (libc)Program Error Signals. |
| * SIGCHLD: (libc)Job Control Signals. |
| * SIGCLD: (libc)Job Control Signals. |
| * SIGCONT: (libc)Job Control Signals. |
| * SIGEMT: (libc)Program Error Signals. |
| * SIGFPE: (libc)Program Error Signals. |
| * SIGHUP: (libc)Termination Signals. |
| * SIGILL: (libc)Program Error Signals. |
| * SIGINFO: (libc)Miscellaneous Signals. |
| * SIGINT: (libc)Termination Signals. |
| * SIGIO: (libc)Asynchronous I/O Signals. |
| * SIGIOT: (libc)Program Error Signals. |
| * SIGKILL: (libc)Termination Signals. |
| * SIGLOST: (libc)Operation Error Signals. |
| * SIGPIPE: (libc)Operation Error Signals. |
| * SIGPOLL: (libc)Asynchronous I/O Signals. |
| * SIGPROF: (libc)Alarm Signals. |
| * SIGQUIT: (libc)Termination Signals. |
| * SIGSEGV: (libc)Program Error Signals. |
| * SIGSTOP: (libc)Job Control Signals. |
| * SIGSYS: (libc)Program Error Signals. |
| * SIGTERM: (libc)Termination Signals. |
| * SIGTRAP: (libc)Program Error Signals. |
| * SIGTSTP: (libc)Job Control Signals. |
| * SIGTTIN: (libc)Job Control Signals. |
| * SIGTTOU: (libc)Job Control Signals. |
| * SIGURG: (libc)Asynchronous I/O Signals. |
| * SIGUSR1: (libc)Miscellaneous Signals. |
| * SIGUSR2: (libc)Miscellaneous Signals. |
| * SIGVTALRM: (libc)Alarm Signals. |
| * SIGWINCH: (libc)Miscellaneous Signals. |
| * SIGXCPU: (libc)Operation Error Signals. |
| * SIGXFSZ: (libc)Operation Error Signals. |
| * SIG_ERR: (libc)Basic Signal Handling. |
| * SOCK_DGRAM: (libc)Communication Styles. |
| * SOCK_RAW: (libc)Communication Styles. |
| * SOCK_RDM: (libc)Communication Styles. |
| * SOCK_SEQPACKET: (libc)Communication Styles. |
| * SOCK_STREAM: (libc)Communication Styles. |
| * SOL_SOCKET: (libc)Socket-Level Options. |
| * SSIZE_MAX: (libc)General Limits. |
| * STREAM_MAX: (libc)General Limits. |
| * SUN_LEN: (libc)Local Namespace Details. |
| * SV_INTERRUPT: (libc)BSD Handler. |
| * SV_ONSTACK: (libc)BSD Handler. |
| * SV_RESETHAND: (libc)BSD Handler. |
| * S_IFMT: (libc)Testing File Type. |
| * S_ISBLK: (libc)Testing File Type. |
| * S_ISCHR: (libc)Testing File Type. |
| * S_ISDIR: (libc)Testing File Type. |
| * S_ISFIFO: (libc)Testing File Type. |
| * S_ISLNK: (libc)Testing File Type. |
| * S_ISREG: (libc)Testing File Type. |
| * S_ISSOCK: (libc)Testing File Type. |
| * S_TYPEISMQ: (libc)Testing File Type. |
| * S_TYPEISSEM: (libc)Testing File Type. |
| * S_TYPEISSHM: (libc)Testing File Type. |
| * TMP_MAX: (libc)Temporary Files. |
| * TOSTOP: (libc)Local Modes. |
| * TZNAME_MAX: (libc)General Limits. |
| * VDISCARD: (libc)Other Special. |
| * VDSUSP: (libc)Signal Characters. |
| * VEOF: (libc)Editing Characters. |
| * VEOL2: (libc)Editing Characters. |
| * VEOL: (libc)Editing Characters. |
| * VERASE: (libc)Editing Characters. |
| * VINTR: (libc)Signal Characters. |
| * VKILL: (libc)Editing Characters. |
| * VLNEXT: (libc)Other Special. |
| * VMIN: (libc)Noncanonical Input. |
| * VQUIT: (libc)Signal Characters. |
| * VREPRINT: (libc)Editing Characters. |
| * VSTART: (libc)Start/Stop Characters. |
| * VSTATUS: (libc)Other Special. |
| * VSTOP: (libc)Start/Stop Characters. |
| * VSUSP: (libc)Signal Characters. |
| * VTIME: (libc)Noncanonical Input. |
| * VWERASE: (libc)Editing Characters. |
| * WCHAR_MAX: (libc)Extended Char Intro. |
| * WCHAR_MIN: (libc)Extended Char Intro. |
| * WCOREDUMP: (libc)Process Completion Status. |
| * WEOF: (libc)EOF and Errors. |
| * WEOF: (libc)Extended Char Intro. |
| * WEXITSTATUS: (libc)Process Completion Status. |
| * WIFEXITED: (libc)Process Completion Status. |
| * WIFSIGNALED: (libc)Process Completion Status. |
| * WIFSTOPPED: (libc)Process Completion Status. |
| * WSTOPSIG: (libc)Process Completion Status. |
| * WTERMSIG: (libc)Process Completion Status. |
| * W_OK: (libc)Testing File Access. |
| * X_OK: (libc)Testing File Access. |
| * _Complex_I: (libc)Complex Numbers. |
| * _Exit: (libc)Termination Internals. |
| * _IOFBF: (libc)Controlling Buffering. |
| * _IOLBF: (libc)Controlling Buffering. |
| * _IONBF: (libc)Controlling Buffering. |
| * _Imaginary_I: (libc)Complex Numbers. |
| * _PATH_UTMP: (libc)Manipulating the Database. |
| * _PATH_WTMP: (libc)Manipulating the Database. |
| * _POSIX2_C_DEV: (libc)System Options. |
| * _POSIX2_C_VERSION: (libc)Version Supported. |
| * _POSIX2_FORT_DEV: (libc)System Options. |
| * _POSIX2_FORT_RUN: (libc)System Options. |
| * _POSIX2_LOCALEDEF: (libc)System Options. |
| * _POSIX2_SW_DEV: (libc)System Options. |
| * _POSIX_CHOWN_RESTRICTED: (libc)Options for Files. |
| * _POSIX_JOB_CONTROL: (libc)System Options. |
| * _POSIX_NO_TRUNC: (libc)Options for Files. |
| * _POSIX_SAVED_IDS: (libc)System Options. |
| * _POSIX_VDISABLE: (libc)Options for Files. |
| * _POSIX_VERSION: (libc)Version Supported. |
| * __fbufsize: (libc)Controlling Buffering. |
| * __flbf: (libc)Controlling Buffering. |
| * __fpending: (libc)Controlling Buffering. |
| * __fpurge: (libc)Flushing Buffers. |
| * __freadable: (libc)Opening Streams. |
| * __freading: (libc)Opening Streams. |
| * __fsetlocking: (libc)Streams and Threads. |
| * __fwritable: (libc)Opening Streams. |
| * __fwriting: (libc)Opening Streams. |
| * __gconv_end_fct: (libc)glibc iconv Implementation. |
| * __gconv_fct: (libc)glibc iconv Implementation. |
| * __gconv_init_fct: (libc)glibc iconv Implementation. |
| * __ppc_get_timebase: (libc)PowerPC. |
| * __ppc_get_timebase_freq: (libc)PowerPC. |
| * __ppc_mdoio: (libc)PowerPC. |
| * __ppc_mdoom: (libc)PowerPC. |
| * __ppc_set_ppr_low: (libc)PowerPC. |
| * __ppc_set_ppr_med: (libc)PowerPC. |
| * __ppc_set_ppr_med_low: (libc)PowerPC. |
| * __ppc_yield: (libc)PowerPC. |
| * __va_copy: (libc)Argument Macros. |
| * _exit: (libc)Termination Internals. |
| * _flushlbf: (libc)Flushing Buffers. |
| * _tolower: (libc)Case Conversion. |
| * _toupper: (libc)Case Conversion. |
| * a64l: (libc)Encode Binary Data. |
| * abort: (libc)Aborting a Program. |
| * abs: (libc)Absolute Value. |
| * accept: (libc)Accepting Connections. |
| * access: (libc)Testing File Access. |
| * acos: (libc)Inverse Trig Functions. |
| * acosf: (libc)Inverse Trig Functions. |
| * acosh: (libc)Hyperbolic Functions. |
| * acoshf: (libc)Hyperbolic Functions. |
| * acoshl: (libc)Hyperbolic Functions. |
| * acosl: (libc)Inverse Trig Functions. |
| * addmntent: (libc)mtab. |
| * addseverity: (libc)Adding Severity Classes. |
| * adjtime: (libc)High-Resolution Calendar. |
| * adjtimex: (libc)High-Resolution Calendar. |
| * aio_cancel64: (libc)Cancel AIO Operations. |
| * aio_cancel: (libc)Cancel AIO Operations. |
| * aio_error64: (libc)Status of AIO Operations. |
| * aio_error: (libc)Status of AIO Operations. |
| * aio_fsync64: (libc)Synchronizing AIO Operations. |
| * aio_fsync: (libc)Synchronizing AIO Operations. |
| * aio_init: (libc)Configuration of AIO. |
| * aio_read64: (libc)Asynchronous Reads/Writes. |
| * aio_read: (libc)Asynchronous Reads/Writes. |
| * aio_return64: (libc)Status of AIO Operations. |
| * aio_return: (libc)Status of AIO Operations. |
| * aio_suspend64: (libc)Synchronizing AIO Operations. |
| * aio_suspend: (libc)Synchronizing AIO Operations. |
| * aio_write64: (libc)Asynchronous Reads/Writes. |
| * aio_write: (libc)Asynchronous Reads/Writes. |
| * alarm: (libc)Setting an Alarm. |
| * aligned_alloc: (libc)Aligned Memory Blocks. |
| * alloca: (libc)Variable Size Automatic. |
| * alphasort64: (libc)Scanning Directory Content. |
| * alphasort: (libc)Scanning Directory Content. |
| * argp_error: (libc)Argp Helper Functions. |
| * argp_failure: (libc)Argp Helper Functions. |
| * argp_help: (libc)Argp Help. |
| * argp_parse: (libc)Argp. |
| * argp_state_help: (libc)Argp Helper Functions. |
| * argp_usage: (libc)Argp Helper Functions. |
| * argz_add: (libc)Argz Functions. |
| * argz_add_sep: (libc)Argz Functions. |
| * argz_append: (libc)Argz Functions. |
| * argz_count: (libc)Argz Functions. |
| * argz_create: (libc)Argz Functions. |
| * argz_create_sep: (libc)Argz Functions. |
| * argz_delete: (libc)Argz Functions. |
| * argz_extract: (libc)Argz Functions. |
| * argz_insert: (libc)Argz Functions. |
| * argz_next: (libc)Argz Functions. |
| * argz_replace: (libc)Argz Functions. |
| * argz_stringify: (libc)Argz Functions. |
| * asctime: (libc)Formatting Calendar Time. |
| * asctime_r: (libc)Formatting Calendar Time. |
| * asin: (libc)Inverse Trig Functions. |
| * asinf: (libc)Inverse Trig Functions. |
| * asinh: (libc)Hyperbolic Functions. |
| * asinhf: (libc)Hyperbolic Functions. |
| * asinhl: (libc)Hyperbolic Functions. |
| * asinl: (libc)Inverse Trig Functions. |
| * asprintf: (libc)Dynamic Output. |
| * assert: (libc)Consistency Checking. |
| * assert_perror: (libc)Consistency Checking. |
| * atan2: (libc)Inverse Trig Functions. |
| * atan2f: (libc)Inverse Trig Functions. |
| * atan2l: (libc)Inverse Trig Functions. |
| * atan: (libc)Inverse Trig Functions. |
| * atanf: (libc)Inverse Trig Functions. |
| * atanh: (libc)Hyperbolic Functions. |
| * atanhf: (libc)Hyperbolic Functions. |
| * atanhl: (libc)Hyperbolic Functions. |
| * atanl: (libc)Inverse Trig Functions. |
| * atexit: (libc)Cleanups on Exit. |
| * atof: (libc)Parsing of Floats. |
| * atoi: (libc)Parsing of Integers. |
| * atol: (libc)Parsing of Integers. |
| * atoll: (libc)Parsing of Integers. |
| * backtrace: (libc)Backtraces. |
| * backtrace_symbols: (libc)Backtraces. |
| * backtrace_symbols_fd: (libc)Backtraces. |
| * basename: (libc)Finding Tokens in a String. |
| * bcmp: (libc)String/Array Comparison. |
| * bcopy: (libc)Copying and Concatenation. |
| * bind: (libc)Setting Address. |
| * bind_textdomain_codeset: (libc)Charset conversion in gettext. |
| * bindtextdomain: (libc)Locating gettext catalog. |
| * brk: (libc)Resizing the Data Segment. |
| * bsearch: (libc)Array Search Function. |
| * btowc: (libc)Converting a Character. |
| * bzero: (libc)Copying and Concatenation. |
| * cabs: (libc)Absolute Value. |
| * cabsf: (libc)Absolute Value. |
| * cabsl: (libc)Absolute Value. |
| * cacos: (libc)Inverse Trig Functions. |
| * cacosf: (libc)Inverse Trig Functions. |
| * cacosh: (libc)Hyperbolic Functions. |
| * cacoshf: (libc)Hyperbolic Functions. |
| * cacoshl: (libc)Hyperbolic Functions. |
| * cacosl: (libc)Inverse Trig Functions. |
| * calloc: (libc)Allocating Cleared Space. |
| * canonicalize_file_name: (libc)Symbolic Links. |
| * carg: (libc)Operations on Complex. |
| * cargf: (libc)Operations on Complex. |
| * cargl: (libc)Operations on Complex. |
| * casin: (libc)Inverse Trig Functions. |
| * casinf: (libc)Inverse Trig Functions. |
| * casinh: (libc)Hyperbolic Functions. |
| * casinhf: (libc)Hyperbolic Functions. |
| * casinhl: (libc)Hyperbolic Functions. |
| * casinl: (libc)Inverse Trig Functions. |
| * catan: (libc)Inverse Trig Functions. |
| * catanf: (libc)Inverse Trig Functions. |
| * catanh: (libc)Hyperbolic Functions. |
| * catanhf: (libc)Hyperbolic Functions. |
| * catanhl: (libc)Hyperbolic Functions. |
| * catanl: (libc)Inverse Trig Functions. |
| * catclose: (libc)The catgets Functions. |
| * catgets: (libc)The catgets Functions. |
| * catopen: (libc)The catgets Functions. |
| * cbc_crypt: (libc)DES Encryption. |
| * cbrt: (libc)Exponents and Logarithms. |
| * cbrtf: (libc)Exponents and Logarithms. |
| * cbrtl: (libc)Exponents and Logarithms. |
| * ccos: (libc)Trig Functions. |
| * ccosf: (libc)Trig Functions. |
| * ccosh: (libc)Hyperbolic Functions. |
| * ccoshf: (libc)Hyperbolic Functions. |
| * ccoshl: (libc)Hyperbolic Functions. |
| * ccosl: (libc)Trig Functions. |
| * ceil: (libc)Rounding Functions. |
| * ceilf: (libc)Rounding Functions. |
| * ceill: (libc)Rounding Functions. |
| * cexp: (libc)Exponents and Logarithms. |
| * cexpf: (libc)Exponents and Logarithms. |
| * cexpl: (libc)Exponents and Logarithms. |
| * cfgetispeed: (libc)Line Speed. |
| * cfgetospeed: (libc)Line Speed. |
| * cfmakeraw: (libc)Noncanonical Input. |
| * cfree: (libc)Freeing after Malloc. |
| * cfsetispeed: (libc)Line Speed. |
| * cfsetospeed: (libc)Line Speed. |
| * cfsetspeed: (libc)Line Speed. |
| * chdir: (libc)Working Directory. |
| * chmod: (libc)Setting Permissions. |
| * chown: (libc)File Owner. |
| * cimag: (libc)Operations on Complex. |
| * cimagf: (libc)Operations on Complex. |
| * cimagl: (libc)Operations on Complex. |
| * clearenv: (libc)Environment Access. |
| * clearerr: (libc)Error Recovery. |
| * clearerr_unlocked: (libc)Error Recovery. |
| * clock: (libc)CPU Time. |
| * clog10: (libc)Exponents and Logarithms. |
| * clog10f: (libc)Exponents and Logarithms. |
| * clog10l: (libc)Exponents and Logarithms. |
| * clog: (libc)Exponents and Logarithms. |
| * clogf: (libc)Exponents and Logarithms. |
| * clogl: (libc)Exponents and Logarithms. |
| * close: (libc)Opening and Closing Files. |
| * closedir: (libc)Reading/Closing Directory. |
| * closelog: (libc)closelog. |
| * confstr: (libc)String Parameters. |
| * conj: (libc)Operations on Complex. |
| * conjf: (libc)Operations on Complex. |
| * conjl: (libc)Operations on Complex. |
| * connect: (libc)Connecting. |
| * copysign: (libc)FP Bit Twiddling. |
| * copysignf: (libc)FP Bit Twiddling. |
| * copysignl: (libc)FP Bit Twiddling. |
| * cos: (libc)Trig Functions. |
| * cosf: (libc)Trig Functions. |
| * cosh: (libc)Hyperbolic Functions. |
| * coshf: (libc)Hyperbolic Functions. |
| * coshl: (libc)Hyperbolic Functions. |
| * cosl: (libc)Trig Functions. |
| * cpow: (libc)Exponents and Logarithms. |
| * cpowf: (libc)Exponents and Logarithms. |
| * cpowl: (libc)Exponents and Logarithms. |
| * cproj: (libc)Operations on Complex. |
| * cprojf: (libc)Operations on Complex. |
| * cprojl: (libc)Operations on Complex. |
| * creal: (libc)Operations on Complex. |
| * crealf: (libc)Operations on Complex. |
| * creall: (libc)Operations on Complex. |
| * creat64: (libc)Opening and Closing Files. |
| * creat: (libc)Opening and Closing Files. |
| * crypt: (libc)crypt. |
| * crypt_r: (libc)crypt. |
| * csin: (libc)Trig Functions. |
| * csinf: (libc)Trig Functions. |
| * csinh: (libc)Hyperbolic Functions. |
| * csinhf: (libc)Hyperbolic Functions. |
| * csinhl: (libc)Hyperbolic Functions. |
| * csinl: (libc)Trig Functions. |
| * csqrt: (libc)Exponents and Logarithms. |
| * csqrtf: (libc)Exponents and Logarithms. |
| * csqrtl: (libc)Exponents and Logarithms. |
| * ctan: (libc)Trig Functions. |
| * ctanf: (libc)Trig Functions. |
| * ctanh: (libc)Hyperbolic Functions. |
| * ctanhf: (libc)Hyperbolic Functions. |
| * ctanhl: (libc)Hyperbolic Functions. |
| * ctanl: (libc)Trig Functions. |
| * ctermid: (libc)Identifying the Terminal. |
| * ctime: (libc)Formatting Calendar Time. |
| * ctime_r: (libc)Formatting Calendar Time. |
| * cuserid: (libc)Who Logged In. |
| * dcgettext: (libc)Translation with gettext. |
| * dcngettext: (libc)Advanced gettext functions. |
| * des_setparity: (libc)DES Encryption. |
| * dgettext: (libc)Translation with gettext. |
| * difftime: (libc)Elapsed Time. |
| * dirfd: (libc)Opening a Directory. |
| * dirname: (libc)Finding Tokens in a String. |
| * div: (libc)Integer Division. |
| * dngettext: (libc)Advanced gettext functions. |
| * drand48: (libc)SVID Random. |
| * drand48_r: (libc)SVID Random. |
| * drem: (libc)Remainder Functions. |
| * dremf: (libc)Remainder Functions. |
| * dreml: (libc)Remainder Functions. |
| * dup2: (libc)Duplicating Descriptors. |
| * dup: (libc)Duplicating Descriptors. |
| * ecb_crypt: (libc)DES Encryption. |
| * ecvt: (libc)System V Number Conversion. |
| * ecvt_r: (libc)System V Number Conversion. |
| * encrypt: (libc)DES Encryption. |
| * encrypt_r: (libc)DES Encryption. |
| * endfsent: (libc)fstab. |
| * endgrent: (libc)Scanning All Groups. |
| * endhostent: (libc)Host Names. |
| * endmntent: (libc)mtab. |
| * endnetent: (libc)Networks Database. |
| * endnetgrent: (libc)Lookup Netgroup. |
| * endprotoent: (libc)Protocols Database. |
| * endpwent: (libc)Scanning All Users. |
| * endservent: (libc)Services Database. |
| * endutent: (libc)Manipulating the Database. |
| * endutxent: (libc)XPG Functions. |
| * envz_add: (libc)Envz Functions. |
| * envz_entry: (libc)Envz Functions. |
| * envz_get: (libc)Envz Functions. |
| * envz_merge: (libc)Envz Functions. |
| * envz_strip: (libc)Envz Functions. |
| * erand48: (libc)SVID Random. |
| * erand48_r: (libc)SVID Random. |
| * erf: (libc)Special Functions. |
| * erfc: (libc)Special Functions. |
| * erfcf: (libc)Special Functions. |
| * erfcl: (libc)Special Functions. |
| * erff: (libc)Special Functions. |
| * erfl: (libc)Special Functions. |
| * err: (libc)Error Messages. |
| * errno: (libc)Checking for Errors. |
| * error: (libc)Error Messages. |
| * error_at_line: (libc)Error Messages. |
| * errx: (libc)Error Messages. |
| * execl: (libc)Executing a File. |
| * execle: (libc)Executing a File. |
| * execlp: (libc)Executing a File. |
| * execv: (libc)Executing a File. |
| * execve: (libc)Executing a File. |
| * execvp: (libc)Executing a File. |
| * exit: (libc)Normal Termination. |
| * exp10: (libc)Exponents and Logarithms. |
| * exp10f: (libc)Exponents and Logarithms. |
| * exp10l: (libc)Exponents and Logarithms. |
| * exp2: (libc)Exponents and Logarithms. |
| * exp2f: (libc)Exponents and Logarithms. |
| * exp2l: (libc)Exponents and Logarithms. |
| * exp: (libc)Exponents and Logarithms. |
| * expf: (libc)Exponents and Logarithms. |
| * expl: (libc)Exponents and Logarithms. |
| * expm1: (libc)Exponents and Logarithms. |
| * expm1f: (libc)Exponents and Logarithms. |
| * expm1l: (libc)Exponents and Logarithms. |
| * fabs: (libc)Absolute Value. |
| * fabsf: (libc)Absolute Value. |
| * fabsl: (libc)Absolute Value. |
| * fchdir: (libc)Working Directory. |
| * fchmod: (libc)Setting Permissions. |
| * fchown: (libc)File Owner. |
| * fclose: (libc)Closing Streams. |
| * fcloseall: (libc)Closing Streams. |
| * fcntl: (libc)Control Operations. |
| * fcvt: (libc)System V Number Conversion. |
| * fcvt_r: (libc)System V Number Conversion. |
| * fdatasync: (libc)Synchronizing I/O. |
| * fdim: (libc)Misc FP Arithmetic. |
| * fdimf: (libc)Misc FP Arithmetic. |
| * fdiml: (libc)Misc FP Arithmetic. |
| * fdopen: (libc)Descriptors and Streams. |
| * fdopendir: (libc)Opening a Directory. |
| * feclearexcept: (libc)Status bit operations. |
| * fedisableexcept: (libc)Control Functions. |
| * feenableexcept: (libc)Control Functions. |
| * fegetenv: (libc)Control Functions. |
| * fegetexcept: (libc)Control Functions. |
| * fegetexceptflag: (libc)Status bit operations. |
| * fegetround: (libc)Rounding. |
| * feholdexcept: (libc)Control Functions. |
| * feof: (libc)EOF and Errors. |
| * feof_unlocked: (libc)EOF and Errors. |
| * feraiseexcept: (libc)Status bit operations. |
| * ferror: (libc)EOF and Errors. |
| * ferror_unlocked: (libc)EOF and Errors. |
| * fesetenv: (libc)Control Functions. |
| * fesetexceptflag: (libc)Status bit operations. |
| * fesetround: (libc)Rounding. |
| * fetestexcept: (libc)Status bit operations. |
| * feupdateenv: (libc)Control Functions. |
| * fflush: (libc)Flushing Buffers. |
| * fflush_unlocked: (libc)Flushing Buffers. |
| * fgetc: (libc)Character Input. |
| * fgetc_unlocked: (libc)Character Input. |
| * fgetgrent: (libc)Scanning All Groups. |
| * fgetgrent_r: (libc)Scanning All Groups. |
| * fgetpos64: (libc)Portable Positioning. |
| * fgetpos: (libc)Portable Positioning. |
| * fgetpwent: (libc)Scanning All Users. |
| * fgetpwent_r: (libc)Scanning All Users. |
| * fgets: (libc)Line Input. |
| * fgets_unlocked: (libc)Line Input. |
| * fgetwc: (libc)Character Input. |
| * fgetwc_unlocked: (libc)Character Input. |
| * fgetws: (libc)Line Input. |
| * fgetws_unlocked: (libc)Line Input. |
| * fileno: (libc)Descriptors and Streams. |
| * fileno_unlocked: (libc)Descriptors and Streams. |
| * finite: (libc)Floating Point Classes. |
| * finitef: (libc)Floating Point Classes. |
| * finitel: (libc)Floating Point Classes. |
| * flockfile: (libc)Streams and Threads. |
| * floor: (libc)Rounding Functions. |
| * floorf: (libc)Rounding Functions. |
| * floorl: (libc)Rounding Functions. |
| * fma: (libc)Misc FP Arithmetic. |
| * fmaf: (libc)Misc FP Arithmetic. |
| * fmal: (libc)Misc FP Arithmetic. |
| * fmax: (libc)Misc FP Arithmetic. |
| * fmaxf: (libc)Misc FP Arithmetic. |
| * fmaxl: (libc)Misc FP Arithmetic. |
| * fmemopen: (libc)String Streams. |
| * fmin: (libc)Misc FP Arithmetic. |
| * fminf: (libc)Misc FP Arithmetic. |
| * fminl: (libc)Misc FP Arithmetic. |
| * fmod: (libc)Remainder Functions. |
| * fmodf: (libc)Remainder Functions. |
| * fmodl: (libc)Remainder Functions. |
| * fmtmsg: (libc)Printing Formatted Messages. |
| * fnmatch: (libc)Wildcard Matching. |
| * fopen64: (libc)Opening Streams. |
| * fopen: (libc)Opening Streams. |
| * fopencookie: (libc)Streams and Cookies. |
| * fork: (libc)Creating a Process. |
| * forkpty: (libc)Pseudo-Terminal Pairs. |
| * fpathconf: (libc)Pathconf. |
| * fpclassify: (libc)Floating Point Classes. |
| * fprintf: (libc)Formatted Output Functions. |
| * fputc: (libc)Simple Output. |
| * fputc_unlocked: (libc)Simple Output. |
| * fputs: (libc)Simple Output. |
| * fputs_unlocked: (libc)Simple Output. |
| * fputwc: (libc)Simple Output. |
| * fputwc_unlocked: (libc)Simple Output. |
| * fputws: (libc)Simple Output. |
| * fputws_unlocked: (libc)Simple Output. |
| * fread: (libc)Block Input/Output. |
| * fread_unlocked: (libc)Block Input/Output. |
| * free: (libc)Freeing after Malloc. |
| * freopen64: (libc)Opening Streams. |
| * freopen: (libc)Opening Streams. |
| * frexp: (libc)Normalization Functions. |
| * frexpf: (libc)Normalization Functions. |
| * frexpl: (libc)Normalization Functions. |
| * fscanf: (libc)Formatted Input Functions. |
| * fseek: (libc)File Positioning. |
| * fseeko64: (libc)File Positioning. |
| * fseeko: (libc)File Positioning. |
| * fsetpos64: (libc)Portable Positioning. |
| * fsetpos: (libc)Portable Positioning. |
| * fstat64: (libc)Reading Attributes. |
| * fstat: (libc)Reading Attributes. |
| * fsync: (libc)Synchronizing I/O. |
| * ftell: (libc)File Positioning. |
| * ftello64: (libc)File Positioning. |
| * ftello: (libc)File Positioning. |
| * ftruncate64: (libc)File Size. |
| * ftruncate: (libc)File Size. |
| * ftrylockfile: (libc)Streams and Threads. |
| * ftw64: (libc)Working with Directory Trees. |
| * ftw: (libc)Working with Directory Trees. |
| * funlockfile: (libc)Streams and Threads. |
| * futimes: (libc)File Times. |
| * fwide: (libc)Streams and I18N. |
| * fwprintf: (libc)Formatted Output Functions. |
| * fwrite: (libc)Block Input/Output. |
| * fwrite_unlocked: (libc)Block Input/Output. |
| * fwscanf: (libc)Formatted Input Functions. |
| * gamma: (libc)Special Functions. |
| * gammaf: (libc)Special Functions. |
| * gammal: (libc)Special Functions. |
| * gcvt: (libc)System V Number Conversion. |
| * get_avphys_pages: (libc)Query Memory Parameters. |
| * get_current_dir_name: (libc)Working Directory. |
| * get_nprocs: (libc)Processor Resources. |
| * get_nprocs_conf: (libc)Processor Resources. |
| * get_phys_pages: (libc)Query Memory Parameters. |
| * getauxval: (libc)Auxiliary Vector. |
| * getc: (libc)Character Input. |
| * getc_unlocked: (libc)Character Input. |
| * getchar: (libc)Character Input. |
| * getchar_unlocked: (libc)Character Input. |
| * getcontext: (libc)System V contexts. |
| * getcwd: (libc)Working Directory. |
| * getdate: (libc)General Time String Parsing. |
| * getdate_r: (libc)General Time String Parsing. |
| * getdelim: (libc)Line Input. |
| * getdomainname: (libc)Host Identification. |
| * getegid: (libc)Reading Persona. |
| * getenv: (libc)Environment Access. |
| * geteuid: (libc)Reading Persona. |
| * getfsent: (libc)fstab. |
| * getfsfile: (libc)fstab. |
| * getfsspec: (libc)fstab. |
| * getgid: (libc)Reading Persona. |
| * getgrent: (libc)Scanning All Groups. |
| * getgrent_r: (libc)Scanning All Groups. |
| * getgrgid: (libc)Lookup Group. |
| * getgrgid_r: (libc)Lookup Group. |
| * getgrnam: (libc)Lookup Group. |
| * getgrnam_r: (libc)Lookup Group. |
| * getgrouplist: (libc)Setting Groups. |
| * getgroups: (libc)Reading Persona. |
| * gethostbyaddr: (libc)Host Names. |
| * gethostbyaddr_r: (libc)Host Names. |
| * gethostbyname2: (libc)Host Names. |
| * gethostbyname2_r: (libc)Host Names. |
| * gethostbyname: (libc)Host Names. |
| * gethostbyname_r: (libc)Host Names. |
| * gethostent: (libc)Host Names. |
| * gethostid: (libc)Host Identification. |
| * gethostname: (libc)Host Identification. |
| * getitimer: (libc)Setting an Alarm. |
| * getline: (libc)Line Input. |
| * getloadavg: (libc)Processor Resources. |
| * getlogin: (libc)Who Logged In. |
| * getmntent: (libc)mtab. |
| * getmntent_r: (libc)mtab. |
| * getnetbyaddr: (libc)Networks Database. |
| * getnetbyname: (libc)Networks Database. |
| * getnetent: (libc)Networks Database. |
| * getnetgrent: (libc)Lookup Netgroup. |
| * getnetgrent_r: (libc)Lookup Netgroup. |
| * getopt: (libc)Using Getopt. |
| * getopt_long: (libc)Getopt Long Options. |
| * getopt_long_only: (libc)Getopt Long Options. |
| * getpagesize: (libc)Query Memory Parameters. |
| * getpass: (libc)getpass. |
| * getpeername: (libc)Who is Connected. |
| * getpgid: (libc)Process Group Functions. |
| * getpgrp: (libc)Process Group Functions. |
| * getpid: (libc)Process Identification. |
| * getppid: (libc)Process Identification. |
| * getpriority: (libc)Traditional Scheduling Functions. |
| * getprotobyname: (libc)Protocols Database. |
| * getprotobynumber: (libc)Protocols Database. |
| * getprotoent: (libc)Protocols Database. |
| * getpt: (libc)Allocation. |
| * getpwent: (libc)Scanning All Users. |
| * getpwent_r: (libc)Scanning All Users. |
| * getpwnam: (libc)Lookup User. |
| * getpwnam_r: (libc)Lookup User. |
| * getpwuid: (libc)Lookup User. |
| * getpwuid_r: (libc)Lookup User. |
| * getrlimit64: (libc)Limits on Resources. |
| * getrlimit: (libc)Limits on Resources. |
| * getrusage: (libc)Resource Usage. |
| * gets: (libc)Line Input. |
| * getservbyname: (libc)Services Database. |
| * getservbyport: (libc)Services Database. |
| * getservent: (libc)Services Database. |
| * getsid: (libc)Process Group Functions. |
| * getsockname: (libc)Reading Address. |
| * getsockopt: (libc)Socket Option Functions. |
| * getsubopt: (libc)Suboptions. |
| * gettext: (libc)Translation with gettext. |
| * gettimeofday: (libc)High-Resolution Calendar. |
| * getuid: (libc)Reading Persona. |
| * getumask: (libc)Setting Permissions. |
| * getutent: (libc)Manipulating the Database. |
| * getutent_r: (libc)Manipulating the Database. |
| * getutid: (libc)Manipulating the Database. |
| * getutid_r: (libc)Manipulating the Database. |
| * getutline: (libc)Manipulating the Database. |
| * getutline_r: (libc)Manipulating the Database. |
| * getutmp: (libc)XPG Functions. |
| * getutmpx: (libc)XPG Functions. |
| * getutxent: (libc)XPG Functions. |
| * getutxid: (libc)XPG Functions. |
| * getutxline: (libc)XPG Functions. |
| * getw: (libc)Character Input. |
| * getwc: (libc)Character Input. |
| * getwc_unlocked: (libc)Character Input. |
| * getwchar: (libc)Character Input. |
| * getwchar_unlocked: (libc)Character Input. |
| * getwd: (libc)Working Directory. |
| * glob64: (libc)Calling Glob. |
| * glob: (libc)Calling Glob. |
| * globfree64: (libc)More Flags for Globbing. |
| * globfree: (libc)More Flags for Globbing. |
| * gmtime: (libc)Broken-down Time. |
| * gmtime_r: (libc)Broken-down Time. |
| * grantpt: (libc)Allocation. |
| * gsignal: (libc)Signaling Yourself. |
| * gtty: (libc)BSD Terminal Modes. |
| * hasmntopt: (libc)mtab. |
| * hcreate: (libc)Hash Search Function. |
| * hcreate_r: (libc)Hash Search Function. |
| * hdestroy: (libc)Hash Search Function. |
| * hdestroy_r: (libc)Hash Search Function. |
| * hsearch: (libc)Hash Search Function. |
| * hsearch_r: (libc)Hash Search Function. |
| * htonl: (libc)Byte Order. |
| * htons: (libc)Byte Order. |
| * hypot: (libc)Exponents and Logarithms. |
| * hypotf: (libc)Exponents and Logarithms. |
| * hypotl: (libc)Exponents and Logarithms. |
| * iconv: (libc)Generic Conversion Interface. |
| * iconv_close: (libc)Generic Conversion Interface. |
| * iconv_open: (libc)Generic Conversion Interface. |
| * if_freenameindex: (libc)Interface Naming. |
| * if_indextoname: (libc)Interface Naming. |
| * if_nameindex: (libc)Interface Naming. |
| * if_nametoindex: (libc)Interface Naming. |
| * ilogb: (libc)Exponents and Logarithms. |
| * ilogbf: (libc)Exponents and Logarithms. |
| * ilogbl: (libc)Exponents and Logarithms. |
| * imaxabs: (libc)Absolute Value. |
| * imaxdiv: (libc)Integer Division. |
| * in6addr_any: (libc)Host Address Data Type. |
| * in6addr_loopback: (libc)Host Address Data Type. |
| * index: (libc)Search Functions. |
| * inet_addr: (libc)Host Address Functions. |
| * inet_aton: (libc)Host Address Functions. |
| * inet_lnaof: (libc)Host Address Functions. |
| * inet_makeaddr: (libc)Host Address Functions. |
| * inet_netof: (libc)Host Address Functions. |
| * inet_network: (libc)Host Address Functions. |
| * inet_ntoa: (libc)Host Address Functions. |
| * inet_ntop: (libc)Host Address Functions. |
| * inet_pton: (libc)Host Address Functions. |
| * initgroups: (libc)Setting Groups. |
| * initstate: (libc)BSD Random. |
| * initstate_r: (libc)BSD Random. |
| * innetgr: (libc)Netgroup Membership. |
| * ioctl: (libc)IOCTLs. |
| * isalnum: (libc)Classification of Characters. |
| * isalpha: (libc)Classification of Characters. |
| * isascii: (libc)Classification of Characters. |
| * isatty: (libc)Is It a Terminal. |
| * isblank: (libc)Classification of Characters. |
| * iscntrl: (libc)Classification of Characters. |
| * isdigit: (libc)Classification of Characters. |
| * isfinite: (libc)Floating Point Classes. |
| * isgraph: (libc)Classification of Characters. |
| * isgreater: (libc)FP Comparison Functions. |
| * isgreaterequal: (libc)FP Comparison Functions. |
| * isinf: (libc)Floating Point Classes. |
| * isinff: (libc)Floating Point Classes. |
| * isinfl: (libc)Floating Point Classes. |
| * isless: (libc)FP Comparison Functions. |
| * islessequal: (libc)FP Comparison Functions. |
| * islessgreater: (libc)FP Comparison Functions. |
| * islower: (libc)Classification of Characters. |
| * isnan: (libc)Floating Point Classes. |
| * isnanf: (libc)Floating Point Classes. |
| * isnanl: (libc)Floating Point Classes. |
| * isnormal: (libc)Floating Point Classes. |
| * isprint: (libc)Classification of Characters. |
| * ispunct: (libc)Classification of Characters. |
| * issignaling: (libc)Floating Point Classes. |
| * isspace: (libc)Classification of Characters. |
| * isunordered: (libc)FP Comparison Functions. |
| * isupper: (libc)Classification of Characters. |
| * iswalnum: (libc)Classification of Wide Characters. |
| * iswalpha: (libc)Classification of Wide Characters. |
| * iswblank: (libc)Classification of Wide Characters. |
| * iswcntrl: (libc)Classification of Wide Characters. |
| * iswctype: (libc)Classification of Wide Characters. |
| * iswdigit: (libc)Classification of Wide Characters. |
| * iswgraph: (libc)Classification of Wide Characters. |
| * iswlower: (libc)Classification of Wide Characters. |
| * iswprint: (libc)Classification of Wide Characters. |
| * iswpunct: (libc)Classification of Wide Characters. |
| * iswspace: (libc)Classification of Wide Characters. |
| * iswupper: (libc)Classification of Wide Characters. |
| * iswxdigit: (libc)Classification of Wide Characters. |
| * isxdigit: (libc)Classification of Characters. |
| * j0: (libc)Special Functions. |
| * j0f: (libc)Special Functions. |
| * j0l: (libc)Special Functions. |
| * j1: (libc)Special Functions. |
| * j1f: (libc)Special Functions. |
| * j1l: (libc)Special Functions. |
| * jn: (libc)Special Functions. |
| * jnf: (libc)Special Functions. |
| * jnl: (libc)Special Functions. |
| * jrand48: (libc)SVID Random. |
| * jrand48_r: (libc)SVID Random. |
| * kill: (libc)Signaling Another Process. |
| * killpg: (libc)Signaling Another Process. |
| * l64a: (libc)Encode Binary Data. |
| * labs: (libc)Absolute Value. |
| * lcong48: (libc)SVID Random. |
| * lcong48_r: (libc)SVID Random. |
| * ldexp: (libc)Normalization Functions. |
| * ldexpf: (libc)Normalization Functions. |
| * ldexpl: (libc)Normalization Functions. |
| * ldiv: (libc)Integer Division. |
| * lfind: (libc)Array Search Function. |
| * lgamma: (libc)Special Functions. |
| * lgamma_r: (libc)Special Functions. |
| * lgammaf: (libc)Special Functions. |
| * lgammaf_r: (libc)Special Functions. |
| * lgammal: (libc)Special Functions. |
| * lgammal_r: (libc)Special Functions. |
| * link: (libc)Hard Links. |
| * lio_listio64: (libc)Asynchronous Reads/Writes. |
| * lio_listio: (libc)Asynchronous Reads/Writes. |
| * listen: (libc)Listening. |
| * llabs: (libc)Absolute Value. |
| * lldiv: (libc)Integer Division. |
| * llrint: (libc)Rounding Functions. |
| * llrintf: (libc)Rounding Functions. |
| * llrintl: (libc)Rounding Functions. |
| * llround: (libc)Rounding Functions. |
| * llroundf: (libc)Rounding Functions. |
| * llroundl: (libc)Rounding Functions. |
| * localeconv: (libc)The Lame Way to Locale Data. |
| * localtime: (libc)Broken-down Time. |
| * localtime_r: (libc)Broken-down Time. |
| * log10: (libc)Exponents and Logarithms. |
| * log10f: (libc)Exponents and Logarithms. |
| * log10l: (libc)Exponents and Logarithms. |
| * log1p: (libc)Exponents and Logarithms. |
| * log1pf: (libc)Exponents and Logarithms. |
| * log1pl: (libc)Exponents and Logarithms. |
| * log2: (libc)Exponents and Logarithms. |
| * log2f: (libc)Exponents and Logarithms. |
| * log2l: (libc)Exponents and Logarithms. |
| * log: (libc)Exponents and Logarithms. |
| * logb: (libc)Exponents and Logarithms. |
| * logbf: (libc)Exponents and Logarithms. |
| * logbl: (libc)Exponents and Logarithms. |
| * logf: (libc)Exponents and Logarithms. |
| * login: (libc)Logging In and Out. |
| * login_tty: (libc)Logging In and Out. |
| * logl: (libc)Exponents and Logarithms. |
| * logout: (libc)Logging In and Out. |
| * logwtmp: (libc)Logging In and Out. |
| * longjmp: (libc)Non-Local Details. |
| * lrand48: (libc)SVID Random. |
| * lrand48_r: (libc)SVID Random. |
| * lrint: (libc)Rounding Functions. |
| * lrintf: (libc)Rounding Functions. |
| * lrintl: (libc)Rounding Functions. |
| * lround: (libc)Rounding Functions. |
| * lroundf: (libc)Rounding Functions. |
| * lroundl: (libc)Rounding Functions. |
| * lsearch: (libc)Array Search Function. |
| * lseek64: (libc)File Position Primitive. |
| * lseek: (libc)File Position Primitive. |
| * lstat64: (libc)Reading Attributes. |
| * lstat: (libc)Reading Attributes. |
| * lutimes: (libc)File Times. |
| * madvise: (libc)Memory-mapped I/O. |
| * makecontext: (libc)System V contexts. |
| * mallinfo: (libc)Statistics of Malloc. |
| * malloc: (libc)Basic Allocation. |
| * mallopt: (libc)Malloc Tunable Parameters. |
| * mblen: (libc)Non-reentrant Character Conversion. |
| * mbrlen: (libc)Converting a Character. |
| * mbrtowc: (libc)Converting a Character. |
| * mbsinit: (libc)Keeping the state. |
| * mbsnrtowcs: (libc)Converting Strings. |
| * mbsrtowcs: (libc)Converting Strings. |
| * mbstowcs: (libc)Non-reentrant String Conversion. |
| * mbtowc: (libc)Non-reentrant Character Conversion. |
| * mcheck: (libc)Heap Consistency Checking. |
| * memalign: (libc)Aligned Memory Blocks. |
| * memccpy: (libc)Copying and Concatenation. |
| * memchr: (libc)Search Functions. |
| * memcmp: (libc)String/Array Comparison. |
| * memcpy: (libc)Copying and Concatenation. |
| * memfrob: (libc)Trivial Encryption. |
| * memmem: (libc)Search Functions. |
| * memmove: (libc)Copying and Concatenation. |
| * mempcpy: (libc)Copying and Concatenation. |
| * memrchr: (libc)Search Functions. |
| * memset: (libc)Copying and Concatenation. |
| * mkdir: (libc)Creating Directories. |
| * mkdtemp: (libc)Temporary Files. |
| * mkfifo: (libc)FIFO Special Files. |
| * mknod: (libc)Making Special Files. |
| * mkstemp: (libc)Temporary Files. |
| * mktemp: (libc)Temporary Files. |
| * mktime: (libc)Broken-down Time. |
| * mlock: (libc)Page Lock Functions. |
| * mlockall: (libc)Page Lock Functions. |
| * mmap64: (libc)Memory-mapped I/O. |
| * mmap: (libc)Memory-mapped I/O. |
| * modf: (libc)Rounding Functions. |
| * modff: (libc)Rounding Functions. |
| * modfl: (libc)Rounding Functions. |
| * mount: (libc)Mount-Unmount-Remount. |
| * mprobe: (libc)Heap Consistency Checking. |
| * mrand48: (libc)SVID Random. |
| * mrand48_r: (libc)SVID Random. |
| * mremap: (libc)Memory-mapped I/O. |
| * msync: (libc)Memory-mapped I/O. |
| * mtrace: (libc)Tracing malloc. |
| * munlock: (libc)Page Lock Functions. |
| * munlockall: (libc)Page Lock Functions. |
| * munmap: (libc)Memory-mapped I/O. |
| * muntrace: (libc)Tracing malloc. |
| * nan: (libc)FP Bit Twiddling. |
| * nanf: (libc)FP Bit Twiddling. |
| * nanl: (libc)FP Bit Twiddling. |
| * nanosleep: (libc)Sleeping. |
| * nearbyint: (libc)Rounding Functions. |
| * nearbyintf: (libc)Rounding Functions. |
| * nearbyintl: (libc)Rounding Functions. |
| * nextafter: (libc)FP Bit Twiddling. |
| * nextafterf: (libc)FP Bit Twiddling. |
| * nextafterl: (libc)FP Bit Twiddling. |
| * nexttoward: (libc)FP Bit Twiddling. |
| * nexttowardf: (libc)FP Bit Twiddling. |
| * nexttowardl: (libc)FP Bit Twiddling. |
| * nftw64: (libc)Working with Directory Trees. |
| * nftw: (libc)Working with Directory Trees. |
| * ngettext: (libc)Advanced gettext functions. |
| * nice: (libc)Traditional Scheduling Functions. |
| * nl_langinfo: (libc)The Elegant and Fast Way. |
| * nrand48: (libc)SVID Random. |
| * nrand48_r: (libc)SVID Random. |
| * ntohl: (libc)Byte Order. |
| * ntohs: (libc)Byte Order. |
| * ntp_adjtime: (libc)High Accuracy Clock. |
| * ntp_gettime: (libc)High Accuracy Clock. |
| * obstack_1grow: (libc)Growing Objects. |
| * obstack_1grow_fast: (libc)Extra Fast Growing. |
| * obstack_alignment_mask: (libc)Obstacks Data Alignment. |
| * obstack_alloc: (libc)Allocation in an Obstack. |
| * obstack_base: (libc)Status of an Obstack. |
| * obstack_blank: (libc)Growing Objects. |
| * obstack_blank_fast: (libc)Extra Fast Growing. |
| * obstack_chunk_size: (libc)Obstack Chunks. |
| * obstack_copy0: (libc)Allocation in an Obstack. |
| * obstack_copy: (libc)Allocation in an Obstack. |
| * obstack_finish: (libc)Growing Objects. |
| * obstack_free: (libc)Freeing Obstack Objects. |
| * obstack_grow0: (libc)Growing Objects. |
| * obstack_grow: (libc)Growing Objects. |
| * obstack_init: (libc)Preparing for Obstacks. |
| * obstack_int_grow: (libc)Growing Objects. |
| * obstack_int_grow_fast: (libc)Extra Fast Growing. |
| * obstack_next_free: (libc)Status of an Obstack. |
| * obstack_object_size: (libc)Growing Objects. |
| * obstack_object_size: (libc)Status of an Obstack. |
| * obstack_printf: (libc)Dynamic Output. |
| * obstack_ptr_grow: (libc)Growing Objects. |
| * obstack_ptr_grow_fast: (libc)Extra Fast Growing. |
| * obstack_room: (libc)Extra Fast Growing. |
| * obstack_vprintf: (libc)Variable Arguments Output. |
| * offsetof: (libc)Structure Measurement. |
| * on_exit: (libc)Cleanups on Exit. |
| * open64: (libc)Opening and Closing Files. |
| * open: (libc)Opening and Closing Files. |
| * open_memstream: (libc)String Streams. |
| * opendir: (libc)Opening a Directory. |
| * openlog: (libc)openlog. |
| * openpty: (libc)Pseudo-Terminal Pairs. |
| * parse_printf_format: (libc)Parsing a Template String. |
| * pathconf: (libc)Pathconf. |
| * pause: (libc)Using Pause. |
| * pclose: (libc)Pipe to a Subprocess. |
| * perror: (libc)Error Messages. |
| * pipe: (libc)Creating a Pipe. |
| * popen: (libc)Pipe to a Subprocess. |
| * posix_memalign: (libc)Aligned Memory Blocks. |
| * pow10: (libc)Exponents and Logarithms. |
| * pow10f: (libc)Exponents and Logarithms. |
| * pow10l: (libc)Exponents and Logarithms. |
| * pow: (libc)Exponents and Logarithms. |
| * powf: (libc)Exponents and Logarithms. |
| * powl: (libc)Exponents and Logarithms. |
| * pread64: (libc)I/O Primitives. |
| * pread: (libc)I/O Primitives. |
| * printf: (libc)Formatted Output Functions. |
| * printf_size: (libc)Predefined Printf Handlers. |
| * printf_size_info: (libc)Predefined Printf Handlers. |
| * psignal: (libc)Signal Messages. |
| * pthread_getattr_default_np: (libc)Default Thread Attributes. |
| * pthread_getspecific: (libc)Thread-specific Data. |
| * pthread_key_create: (libc)Thread-specific Data. |
| * pthread_key_delete: (libc)Thread-specific Data. |
| * pthread_setattr_default_np: (libc)Default Thread Attributes. |
| * pthread_setspecific: (libc)Thread-specific Data. |
| * ptsname: (libc)Allocation. |
| * ptsname_r: (libc)Allocation. |
| * putc: (libc)Simple Output. |
| * putc_unlocked: (libc)Simple Output. |
| * putchar: (libc)Simple Output. |
| * putchar_unlocked: (libc)Simple Output. |
| * putenv: (libc)Environment Access. |
| * putpwent: (libc)Writing a User Entry. |
| * puts: (libc)Simple Output. |
| * pututline: (libc)Manipulating the Database. |
| * pututxline: (libc)XPG Functions. |
| * putw: (libc)Simple Output. |
| * putwc: (libc)Simple Output. |
| * putwc_unlocked: (libc)Simple Output. |
| * putwchar: (libc)Simple Output. |
| * putwchar_unlocked: (libc)Simple Output. |
| * pwrite64: (libc)I/O Primitives. |
| * pwrite: (libc)I/O Primitives. |
| * qecvt: (libc)System V Number Conversion. |
| * qecvt_r: (libc)System V Number Conversion. |
| * qfcvt: (libc)System V Number Conversion. |
| * qfcvt_r: (libc)System V Number Conversion. |
| * qgcvt: (libc)System V Number Conversion. |
| * qsort: (libc)Array Sort Function. |
| * raise: (libc)Signaling Yourself. |
| * rand: (libc)ISO Random. |
| * rand_r: (libc)ISO Random. |
| * random: (libc)BSD Random. |
| * random_r: (libc)BSD Random. |
| * rawmemchr: (libc)Search Functions. |
| * read: (libc)I/O Primitives. |
| * readdir64: (libc)Reading/Closing Directory. |
| * readdir64_r: (libc)Reading/Closing Directory. |
| * readdir: (libc)Reading/Closing Directory. |
| * readdir_r: (libc)Reading/Closing Directory. |
| * readlink: (libc)Symbolic Links. |
| * readv: (libc)Scatter-Gather. |
| * realloc: (libc)Changing Block Size. |
| * realpath: (libc)Symbolic Links. |
| * recv: (libc)Receiving Data. |
| * recvfrom: (libc)Receiving Datagrams. |
| * recvmsg: (libc)Receiving Datagrams. |
| * regcomp: (libc)POSIX Regexp Compilation. |
| * regerror: (libc)Regexp Cleanup. |
| * regexec: (libc)Matching POSIX Regexps. |
| * regfree: (libc)Regexp Cleanup. |
| * register_printf_function: (libc)Registering New Conversions. |
| * remainder: (libc)Remainder Functions. |
| * remainderf: (libc)Remainder Functions. |
| * remainderl: (libc)Remainder Functions. |
| * remove: (libc)Deleting Files. |
| * rename: (libc)Renaming Files. |
| * rewind: (libc)File Positioning. |
| * rewinddir: (libc)Random Access Directory. |
| * rindex: (libc)Search Functions. |
| * rint: (libc)Rounding Functions. |
| * rintf: (libc)Rounding Functions. |
| * rintl: (libc)Rounding Functions. |
| * rmdir: (libc)Deleting Files. |
| * round: (libc)Rounding Functions. |
| * roundf: (libc)Rounding Functions. |
| * roundl: (libc)Rounding Functions. |
| * rpmatch: (libc)Yes-or-No Questions. |
| * sbrk: (libc)Resizing the Data Segment. |
| * scalb: (libc)Normalization Functions. |
| * scalbf: (libc)Normalization Functions. |
| * scalbl: (libc)Normalization Functions. |
| * scalbln: (libc)Normalization Functions. |
| * scalblnf: (libc)Normalization Functions. |
| * scalblnl: (libc)Normalization Functions. |
| * scalbn: (libc)Normalization Functions. |
| * scalbnf: (libc)Normalization Functions. |
| * scalbnl: (libc)Normalization Functions. |
| * scandir64: (libc)Scanning Directory Content. |
| * scandir: (libc)Scanning Directory Content. |
| * scanf: (libc)Formatted Input Functions. |
| * sched_get_priority_max: (libc)Basic Scheduling Functions. |
| * sched_get_priority_min: (libc)Basic Scheduling Functions. |
| * sched_getaffinity: (libc)CPU Affinity. |
| * sched_getparam: (libc)Basic Scheduling Functions. |
| * sched_getscheduler: (libc)Basic Scheduling Functions. |
| * sched_rr_get_interval: (libc)Basic Scheduling Functions. |
| * sched_setaffinity: (libc)CPU Affinity. |
| * sched_setparam: (libc)Basic Scheduling Functions. |
| * sched_setscheduler: (libc)Basic Scheduling Functions. |
| * sched_yield: (libc)Basic Scheduling Functions. |
| * secure_getenv: (libc)Environment Access. |
| * seed48: (libc)SVID Random. |
| * seed48_r: (libc)SVID Random. |
| * seekdir: (libc)Random Access Directory. |
| * select: (libc)Waiting for I/O. |
| * send: (libc)Sending Data. |
| * sendmsg: (libc)Receiving Datagrams. |
| * sendto: (libc)Sending Datagrams. |
| * setbuf: (libc)Controlling Buffering. |
| * setbuffer: (libc)Controlling Buffering. |
| * setcontext: (libc)System V contexts. |
| * setdomainname: (libc)Host Identification. |
| * setegid: (libc)Setting Groups. |
| * setenv: (libc)Environment Access. |
| * seteuid: (libc)Setting User ID. |
| * setfsent: (libc)fstab. |
| * setgid: (libc)Setting Groups. |
| * setgrent: (libc)Scanning All Groups. |
| * setgroups: (libc)Setting Groups. |
| * sethostent: (libc)Host Names. |
| * sethostid: (libc)Host Identification. |
| * sethostname: (libc)Host Identification. |
| * setitimer: (libc)Setting an Alarm. |
| * setjmp: (libc)Non-Local Details. |
| * setkey: (libc)DES Encryption. |
| * setkey_r: (libc)DES Encryption. |
| * setlinebuf: (libc)Controlling Buffering. |
| * setlocale: (libc)Setting the Locale. |
| * setlogmask: (libc)setlogmask. |
| * setmntent: (libc)mtab. |
| * setnetent: (libc)Networks Database. |
| * setnetgrent: (libc)Lookup Netgroup. |
| * setpgid: (libc)Process Group Functions. |
| * setpgrp: (libc)Process Group Functions. |
| * setpriority: (libc)Traditional Scheduling Functions. |
| * setprotoent: (libc)Protocols Database. |
| * setpwent: (libc)Scanning All Users. |
| * setregid: (libc)Setting Groups. |
| * setreuid: (libc)Setting User ID. |
| * setrlimit64: (libc)Limits on Resources. |
| * setrlimit: (libc)Limits on Resources. |
| * setservent: (libc)Services Database. |
| * setsid: (libc)Process Group Functions. |
| * setsockopt: (libc)Socket Option Functions. |
| * setstate: (libc)BSD Random. |
| * setstate_r: (libc)BSD Random. |
| * settimeofday: (libc)High-Resolution Calendar. |
| * setuid: (libc)Setting User ID. |
| * setutent: (libc)Manipulating the Database. |
| * setutxent: (libc)XPG Functions. |
| * setvbuf: (libc)Controlling Buffering. |
| * shm_open: (libc)Memory-mapped I/O. |
| * shm_unlink: (libc)Memory-mapped I/O. |
| * shutdown: (libc)Closing a Socket. |
| * sigaction: (libc)Advanced Signal Handling. |
| * sigaddset: (libc)Signal Sets. |
| * sigaltstack: (libc)Signal Stack. |
| * sigblock: (libc)Blocking in BSD. |
| * sigdelset: (libc)Signal Sets. |
| * sigemptyset: (libc)Signal Sets. |
| * sigfillset: (libc)Signal Sets. |
| * siginterrupt: (libc)BSD Handler. |
| * sigismember: (libc)Signal Sets. |
| * siglongjmp: (libc)Non-Local Exits and Signals. |
| * sigmask: (libc)Blocking in BSD. |
| * signal: (libc)Basic Signal Handling. |
| * signbit: (libc)FP Bit Twiddling. |
| * significand: (libc)Normalization Functions. |
| * significandf: (libc)Normalization Functions. |
| * significandl: (libc)Normalization Functions. |
| * sigpause: (libc)Blocking in BSD. |
| * sigpending: (libc)Checking for Pending Signals. |
| * sigprocmask: (libc)Process Signal Mask. |
| * sigsetjmp: (libc)Non-Local Exits and Signals. |
| * sigsetmask: (libc)Blocking in BSD. |
| * sigstack: (libc)Signal Stack. |
| * sigsuspend: (libc)Sigsuspend. |
| * sigvec: (libc)BSD Handler. |
| * sin: (libc)Trig Functions. |
| * sincos: (libc)Trig Functions. |
| * sincosf: (libc)Trig Functions. |
| * sincosl: (libc)Trig Functions. |
| * sinf: (libc)Trig Functions. |
| * sinh: (libc)Hyperbolic Functions. |
| * sinhf: (libc)Hyperbolic Functions. |
| * sinhl: (libc)Hyperbolic Functions. |
| * sinl: (libc)Trig Functions. |
| * sleep: (libc)Sleeping. |
| * snprintf: (libc)Formatted Output Functions. |
| * socket: (libc)Creating a Socket. |
| * socketpair: (libc)Socket Pairs. |
| * sprintf: (libc)Formatted Output Functions. |
| * sqrt: (libc)Exponents and Logarithms. |
| * sqrtf: (libc)Exponents and Logarithms. |
| * sqrtl: (libc)Exponents and Logarithms. |
| * srand48: (libc)SVID Random. |
| * srand48_r: (libc)SVID Random. |
| * srand: (libc)ISO Random. |
| * srandom: (libc)BSD Random. |
| * srandom_r: (libc)BSD Random. |
| * sscanf: (libc)Formatted Input Functions. |
| * ssignal: (libc)Basic Signal Handling. |
| * stat64: (libc)Reading Attributes. |
| * stat: (libc)Reading Attributes. |
| * stime: (libc)Simple Calendar Time. |
| * stpcpy: (libc)Copying and Concatenation. |
| * stpncpy: (libc)Copying and Concatenation. |
| * strcasecmp: (libc)String/Array Comparison. |
| * strcasestr: (libc)Search Functions. |
| * strcat: (libc)Copying and Concatenation. |
| * strchr: (libc)Search Functions. |
| * strchrnul: (libc)Search Functions. |
| * strcmp: (libc)String/Array Comparison. |
| * strcoll: (libc)Collation Functions. |
| * strcpy: (libc)Copying and Concatenation. |
| * strcspn: (libc)Search Functions. |
| * strdup: (libc)Copying and Concatenation. |
| * strdupa: (libc)Copying and Concatenation. |
| * strerror: (libc)Error Messages. |
| * strerror_r: (libc)Error Messages. |
| * strfmon: (libc)Formatting Numbers. |
| * strfry: (libc)strfry. |
| * strftime: (libc)Formatting Calendar Time. |
| * strlen: (libc)String Length. |
| * strncasecmp: (libc)String/Array Comparison. |
| * strncat: (libc)Copying and Concatenation. |
| * strncmp: (libc)String/Array Comparison. |
| * strncpy: (libc)Copying and Concatenation. |
| * strndup: (libc)Copying and Concatenation. |
| * strndupa: (libc)Copying and Concatenation. |
| * strnlen: (libc)String Length. |
| * strpbrk: (libc)Search Functions. |
| * strptime: (libc)Low-Level Time String Parsing. |
| * strrchr: (libc)Search Functions. |
| * strsep: (libc)Finding Tokens in a String. |
| * strsignal: (libc)Signal Messages. |
| * strspn: (libc)Search Functions. |
| * strstr: (libc)Search Functions. |
| * strtod: (libc)Parsing of Floats. |
| * strtof: (libc)Parsing of Floats. |
| * strtoimax: (libc)Parsing of Integers. |
| * strtok: (libc)Finding Tokens in a String. |
| * strtok_r: (libc)Finding Tokens in a String. |
| * strtol: (libc)Parsing of Integers. |
| * strtold: (libc)Parsing of Floats. |
| * strtoll: (libc)Parsing of Integers. |
| * strtoq: (libc)Parsing of Integers. |
| * strtoul: (libc)Parsing of Integers. |
| * strtoull: (libc)Parsing of Integers. |
| * strtoumax: (libc)Parsing of Integers. |
| * strtouq: (libc)Parsing of Integers. |
| * strverscmp: (libc)String/Array Comparison. |
| * strxfrm: (libc)Collation Functions. |
| * stty: (libc)BSD Terminal Modes. |
| * swapcontext: (libc)System V contexts. |
| * swprintf: (libc)Formatted Output Functions. |
| * swscanf: (libc)Formatted Input Functions. |
| * symlink: (libc)Symbolic Links. |
| * sync: (libc)Synchronizing I/O. |
| * syscall: (libc)System Calls. |
| * sysconf: (libc)Sysconf Definition. |
| * sysctl: (libc)System Parameters. |
| * syslog: (libc)syslog; vsyslog. |
| * system: (libc)Running a Command. |
| * sysv_signal: (libc)Basic Signal Handling. |
| * tan: (libc)Trig Functions. |
| * tanf: (libc)Trig Functions. |
| * tanh: (libc)Hyperbolic Functions. |
| * tanhf: (libc)Hyperbolic Functions. |
| * tanhl: (libc)Hyperbolic Functions. |
| * tanl: (libc)Trig Functions. |
| * tcdrain: (libc)Line Control. |
| * tcflow: (libc)Line Control. |
| * tcflush: (libc)Line Control. |
| * tcgetattr: (libc)Mode Functions. |
| * tcgetpgrp: (libc)Terminal Access Functions. |
| * tcgetsid: (libc)Terminal Access Functions. |
| * tcsendbreak: (libc)Line Control. |
| * tcsetattr: (libc)Mode Functions. |
| * tcsetpgrp: (libc)Terminal Access Functions. |
| * tdelete: (libc)Tree Search Function. |
| * tdestroy: (libc)Tree Search Function. |
| * telldir: (libc)Random Access Directory. |
| * tempnam: (libc)Temporary Files. |
| * textdomain: (libc)Locating gettext catalog. |
| * tfind: (libc)Tree Search Function. |
| * tgamma: (libc)Special Functions. |
| * tgammaf: (libc)Special Functions. |
| * tgammal: (libc)Special Functions. |
| * time: (libc)Simple Calendar Time. |
| * timegm: (libc)Broken-down Time. |
| * timelocal: (libc)Broken-down Time. |
| * times: (libc)Processor Time. |
| * tmpfile64: (libc)Temporary Files. |
| * tmpfile: (libc)Temporary Files. |
| * tmpnam: (libc)Temporary Files. |
| * tmpnam_r: (libc)Temporary Files. |
| * toascii: (libc)Case Conversion. |
| * tolower: (libc)Case Conversion. |
| * toupper: (libc)Case Conversion. |
| * towctrans: (libc)Wide Character Case Conversion. |
| * towlower: (libc)Wide Character Case Conversion. |
| * towupper: (libc)Wide Character Case Conversion. |
| * trunc: (libc)Rounding Functions. |
| * truncate64: (libc)File Size. |
| * truncate: (libc)File Size. |
| * truncf: (libc)Rounding Functions. |
| * truncl: (libc)Rounding Functions. |
| * tsearch: (libc)Tree Search Function. |
| * ttyname: (libc)Is It a Terminal. |
| * ttyname_r: (libc)Is It a Terminal. |
| * twalk: (libc)Tree Search Function. |
| * tzset: (libc)Time Zone Functions. |
| * ulimit: (libc)Limits on Resources. |
| * umask: (libc)Setting Permissions. |
| * umount2: (libc)Mount-Unmount-Remount. |
| * umount: (libc)Mount-Unmount-Remount. |
| * uname: (libc)Platform Type. |
| * ungetc: (libc)How Unread. |
| * ungetwc: (libc)How Unread. |
| * unlink: (libc)Deleting Files. |
| * unlockpt: (libc)Allocation. |
| * unsetenv: (libc)Environment Access. |
| * updwtmp: (libc)Manipulating the Database. |
| * utime: (libc)File Times. |
| * utimes: (libc)File Times. |
| * utmpname: (libc)Manipulating the Database. |
| * utmpxname: (libc)XPG Functions. |
| * va_arg: (libc)Argument Macros. |
| * va_copy: (libc)Argument Macros. |
| * va_end: (libc)Argument Macros. |
| * va_start: (libc)Argument Macros. |
| * valloc: (libc)Aligned Memory Blocks. |
| * vasprintf: (libc)Variable Arguments Output. |
| * verr: (libc)Error Messages. |
| * verrx: (libc)Error Messages. |
| * versionsort64: (libc)Scanning Directory Content. |
| * versionsort: (libc)Scanning Directory Content. |
| * vfork: (libc)Creating a Process. |
| * vfprintf: (libc)Variable Arguments Output. |
| * vfscanf: (libc)Variable Arguments Input. |
| * vfwprintf: (libc)Variable Arguments Output. |
| * vfwscanf: (libc)Variable Arguments Input. |
| * vlimit: (libc)Limits on Resources. |
| * vprintf: (libc)Variable Arguments Output. |
| * vscanf: (libc)Variable Arguments Input. |
| * vsnprintf: (libc)Variable Arguments Output. |
| * vsprintf: (libc)Variable Arguments Output. |
| * vsscanf: (libc)Variable Arguments Input. |
| * vswprintf: (libc)Variable Arguments Output. |
| * vswscanf: (libc)Variable Arguments Input. |
| * vsyslog: (libc)syslog; vsyslog. |
| * vtimes: (libc)Resource Usage. |
| * vwarn: (libc)Error Messages. |
| * vwarnx: (libc)Error Messages. |
| * vwprintf: (libc)Variable Arguments Output. |
| * vwscanf: (libc)Variable Arguments Input. |
| * wait3: (libc)BSD Wait Functions. |
| * wait4: (libc)Process Completion. |
| * wait: (libc)Process Completion. |
| * waitpid: (libc)Process Completion. |
| * warn: (libc)Error Messages. |
| * warnx: (libc)Error Messages. |
| * wcpcpy: (libc)Copying and Concatenation. |
| * wcpncpy: (libc)Copying and Concatenation. |
| * wcrtomb: (libc)Converting a Character. |
| * wcscasecmp: (libc)String/Array Comparison. |
| * wcscat: (libc)Copying and Concatenation. |
| * wcschr: (libc)Search Functions. |
| * wcschrnul: (libc)Search Functions. |
| * wcscmp: (libc)String/Array Comparison. |
| * wcscoll: (libc)Collation Functions. |
| * wcscpy: (libc)Copying and Concatenation. |
| * wcscspn: (libc)Search Functions. |
| * wcsdup: (libc)Copying and Concatenation. |
| * wcsftime: (libc)Formatting Calendar Time. |
| * wcslen: (libc)String Length. |
| * wcsncasecmp: (libc)String/Array Comparison. |
| * wcsncat: (libc)Copying and Concatenation. |
| * wcsncmp: (libc)String/Array Comparison. |
| * wcsncpy: (libc)Copying and Concatenation. |
| * wcsnlen: (libc)String Length. |
| * wcsnrtombs: (libc)Converting Strings. |
| * wcspbrk: (libc)Search Functions. |
| * wcsrchr: (libc)Search Functions. |
| * wcsrtombs: (libc)Converting Strings. |
| * wcsspn: (libc)Search Functions. |
| * wcsstr: (libc)Search Functions. |
| * wcstod: (libc)Parsing of Floats. |
| * wcstof: (libc)Parsing of Floats. |
| * wcstoimax: (libc)Parsing of Integers. |
| * wcstok: (libc)Finding Tokens in a String. |
| * wcstol: (libc)Parsing of Integers. |
| * wcstold: (libc)Parsing of Floats. |
| * wcstoll: (libc)Parsing of Integers. |
| * wcstombs: (libc)Non-reentrant String Conversion. |
| * wcstoq: (libc)Parsing of Integers. |
| * wcstoul: (libc)Parsing of Integers. |
| * wcstoull: (libc)Parsing of Integers. |
| * wcstoumax: (libc)Parsing of Integers. |
| * wcstouq: (libc)Parsing of Integers. |
| * wcswcs: (libc)Search Functions. |
| * wcsxfrm: (libc)Collation Functions. |
| * wctob: (libc)Converting a Character. |
| * wctomb: (libc)Non-reentrant Character Conversion. |
| * wctrans: (libc)Wide Character Case Conversion. |
| * wctype: (libc)Classification of Wide Characters. |
| * wmemchr: (libc)Search Functions. |
| * wmemcmp: (libc)String/Array Comparison. |
| * wmemcpy: (libc)Copying and Concatenation. |
| * wmemmove: (libc)Copying and Concatenation. |
| * wmempcpy: (libc)Copying and Concatenation. |
| * wmemset: (libc)Copying and Concatenation. |
| * wordexp: (libc)Calling Wordexp. |
| * wordfree: (libc)Calling Wordexp. |
| * wprintf: (libc)Formatted Output Functions. |
| * write: (libc)I/O Primitives. |
| * writev: (libc)Scatter-Gather. |
| * wscanf: (libc)Formatted Input Functions. |
| * y0: (libc)Special Functions. |
| * y0f: (libc)Special Functions. |
| * y0l: (libc)Special Functions. |
| * y1: (libc)Special Functions. |
| * y1f: (libc)Special Functions. |
| * y1l: (libc)Special Functions. |
| * yn: (libc)Special Functions. |
| * ynf: (libc)Special Functions. |
| * ynl: (libc)Special Functions. |
| END-INFO-DIR-ENTRY |
| |
| |
| File: libc.info, Node: Obstack Functions, Next: Growing Objects, Prev: Freeing Obstack Objects, Up: Obstacks |
| |
| 3.2.4.5 Obstack Functions and Macros |
| .................................... |
| |
| The interfaces for using obstacks may be defined either as functions or |
| as macros, depending on the compiler. The obstack facility works with |
| all C compilers, including both ISO C and traditional C, but there are |
| precautions you must take if you plan to use compilers other than GNU C. |
| |
| If you are using an old-fashioned non-ISO C compiler, all the obstack |
| "functions" are actually defined only as macros. You can call these |
| macros like functions, but you cannot use them in any other way (for |
| example, you cannot take their address). |
| |
| Calling the macros requires a special precaution: namely, the first |
| operand (the obstack pointer) may not contain any side effects, because |
| it may be computed more than once. For example, if you write this: |
| |
| obstack_alloc (get_obstack (), 4); |
| |
| you will find that 'get_obstack' may be called several times. If you |
| use '*obstack_list_ptr++' as the obstack pointer argument, you will get |
| very strange results since the increment may occur several times. |
| |
| In ISO C, each function has both a macro definition and a function |
| definition. The function definition is used if you take the address of |
| the function without calling it. An ordinary call uses the macro |
| definition by default, but you can request the function definition |
| instead by writing the function name in parentheses, as shown here: |
| |
| char *x; |
| void *(*funcp) (); |
| /* Use the macro. */ |
| x = (char *) obstack_alloc (obptr, size); |
| /* Call the function. */ |
| x = (char *) (obstack_alloc) (obptr, size); |
| /* Take the address of the function. */ |
| funcp = obstack_alloc; |
| |
| This is the same situation that exists in ISO C for the standard library |
| functions. *Note Macro Definitions::. |
| |
| *Warning:* When you do use the macros, you must observe the |
| precaution of avoiding side effects in the first operand, even in ISO C. |
| |
| If you use the GNU C compiler, this precaution is not necessary, |
| because various language extensions in GNU C permit defining the macros |
| so as to compute each argument only once. |
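| |
| A minimal sketch of the precaution follows. It assumes the |
| chunk macros are defined as 'malloc' and 'free'; the names |
| 'the_obstack', 'get_obstack_calls', 'setup_obstack' and |
| 'alloc_from_current' are hypothetical and exist only for this |
| illustration. The obstack pointer is computed once, stored in a |
| local variable, and only then handed to the macro: |

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical stand-ins for the names used in the text.  */
static struct obstack the_obstack;
static int get_obstack_calls;

void
setup_obstack (void)
{
  obstack_init (&the_obstack);
}

static struct obstack *
get_obstack (void)
{
  get_obstack_calls++;          /* The side effect we must not repeat.  */
  return &the_obstack;
}

/* Allocate SIZE bytes, evaluating get_obstack exactly once.  */
void *
alloc_from_current (int size)
{
  struct obstack *ob = get_obstack ();  /* Evaluate once ...  */
  assert (size >= 0);
  return obstack_alloc (ob, size);      /* ... then pass a plain variable.  */
}
```

| Written this way, the call behaves the same whether |
| 'obstack_alloc' evaluates its first operand once or several times. |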
| |
| |
| File: libc.info, Node: Growing Objects, Next: Extra Fast Growing, Prev: Obstack Functions, Up: Obstacks |
| |
| 3.2.4.6 Growing Objects |
| ....................... |
| |
| Because memory in obstack chunks is used sequentially, it is possible to |
| build up an object step by step, adding one or more bytes at a time to |
| the end of the object. With this technique, you do not need to know how |
| much data you will put in the object until you come to the end of it. |
| We call this the technique of "growing objects". The special functions |
| for adding data to the growing object are described in this section. |
| |
| You don't need to do anything special when you start to grow an |
| object. Using one of the functions to add data to the object |
| automatically starts it. However, it is necessary to say explicitly |
| when the object is finished. This is done with the function |
| 'obstack_finish'. |
| |
| The actual address of the object thus built up is not known until the |
| object is finished. Until then, it always remains possible that you |
| will add so much data that the object must be copied into a new chunk. |
| |
| While the obstack is in use for a growing object, you cannot use it |
| for ordinary allocation of another object. If you try to do so, the |
| space already added to the growing object will become part of the other |
| object. |
| |
| -- Function: void obstack_blank (struct obstack *OBSTACK-PTR, int SIZE) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| The most basic function for adding to a growing object is |
| 'obstack_blank', which adds space without initializing it. |
| |
| -- Function: void obstack_grow (struct obstack *OBSTACK-PTR, void |
| *DATA, int SIZE) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| To add a block of initialized space, use 'obstack_grow', which is |
| the growing-object analogue of 'obstack_copy'. It adds SIZE bytes |
| of data to the growing object, copying the contents from DATA. |
| |
| -- Function: void obstack_grow0 (struct obstack *OBSTACK-PTR, void |
| *DATA, int SIZE) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| This is the growing-object analogue of 'obstack_copy0'. It adds |
| SIZE bytes copied from DATA, followed by an additional null |
| character. |
| |
| -- Function: void obstack_1grow (struct obstack *OBSTACK-PTR, char C) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| To add one character at a time, use the function 'obstack_1grow'. |
| It adds a single byte containing C to the growing object. |
| |
| -- Function: void obstack_ptr_grow (struct obstack *OBSTACK-PTR, void |
| *DATA) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| To add the value of a pointer, use the function |
| 'obstack_ptr_grow'. It adds 'sizeof (void *)' bytes containing the |
| value of DATA. |
| |
| -- Function: void obstack_int_grow (struct obstack *OBSTACK-PTR, int |
| DATA) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| A single value of type 'int' can be added by using the |
| 'obstack_int_grow' function. It adds 'sizeof (int)' bytes to the |
| growing object and initializes them with the value of DATA. |
| |
| -- Function: void * obstack_finish (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt | *Note POSIX Safety Concepts::. |
| |
| When you are finished growing the object, use the function |
| 'obstack_finish' to close it off and return its final address. |
| |
| Once you have finished the object, the obstack is available for |
| ordinary allocation or for growing another object. |
| |
| This function can return a null pointer under the same conditions |
| as 'obstack_alloc' (*note Allocation in an Obstack::). |
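| |
| As a rough sketch of how these functions combine, the following |
| builds a string piece by piece without computing its total length |
| first. The function name 'make_assignment' is hypothetical, and the |
| chunk macros are assumed to be 'malloc' and 'free': |

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Build "KEY=VALUE" in OB.  The returned string lives until it is
   freed from the obstack.  */
char *
make_assignment (struct obstack *ob, const char *key, const char *value)
{
  assert (key != NULL && value != NULL);
  obstack_grow (ob, key, strlen (key));      /* Starts the object.  */
  obstack_1grow (ob, '=');                   /* One more byte.  */
  obstack_grow0 (ob, value, strlen (value)); /* Bytes plus a null.  */
  return (char *) obstack_finish (ob);       /* Address is now final.  */
}
```

| The pointer is taken only from 'obstack_finish', because the |
| object may move to a new chunk while it is still growing. |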
| |
| When you build an object by growing it, you will probably need to |
| know afterward how long it became. You need not keep track of this as |
| you grow the object, because you can find out the length from the |
| obstack just before finishing the object with the function |
| 'obstack_object_size', declared as follows: |
| |
| -- Function: int obstack_object_size (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| This function returns the current size of the growing object, in |
| bytes. Remember to call this function _before_ finishing the |
| object. After it is finished, 'obstack_object_size' will return |
| zero. |
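| |
| For illustration, here is one way to collect a vector of pointers |
| whose length is not known in advance and to read its size before |
| finishing it. The function 'split_words' is a hypothetical helper; |
| it splits TEXT in place at spaces and assumes the chunk macros are |
| 'malloc' and 'free': |

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Split TEXT at spaces, in place.  Return a null-terminated vector of
   word pointers allocated in OB and store the word count in *COUNT.  */
char **
split_words (struct obstack *ob, char *text, int *count)
{
  char *p = text;

  assert (text != NULL && count != NULL);
  while (*p != '\0')
    {
      while (*p == ' ')
        p++;                            /* Skip separators.  */
      if (*p == '\0')
        break;
      obstack_ptr_grow (ob, p);         /* Record the start of a word.  */
      while (*p != '\0' && *p != ' ')
        p++;
      if (*p == ' ')
        *p++ = '\0';                    /* Terminate the word.  */
    }

  /* Ask for the size before finishing; afterward it would be zero.  */
  *count = (int) (obstack_object_size (ob) / sizeof (char *));

  obstack_ptr_grow (ob, NULL);          /* Terminating null pointer.  */
  return (char **) obstack_finish (ob);
}
```

| The words themselves stay in TEXT; only the vector lives in the |
| obstack, since nothing else may be allocated there while it grows. |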
| |
| If you have started growing an object and wish to cancel it, you |
| should finish it and then free it, like this: |
| |
| obstack_free (obstack_ptr, obstack_finish (obstack_ptr)); |
| |
| This has no effect if no object was growing. |
| |
| You can use 'obstack_blank' with a negative size argument to make the |
| current object smaller. Just don't try to shrink it beyond zero |
| length--there's no telling what will happen if you do that. |
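| |
| The cancellation idiom might be used as in the sketch below, which |
| abandons a half-built object when the input turns out to be |
| unsuitable. The function 'copy_without_nul' is hypothetical, and |
| the chunk macros are assumed to be 'malloc' and 'free': |

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Copy LEN bytes from DATA into a new null-terminated object in OB.
   If DATA contains a null byte, abandon the object and return NULL.  */
char *
copy_without_nul (struct obstack *ob, const char *data, int len)
{
  int i;

  assert (len >= 0);
  for (i = 0; i < len; i++)
    {
      if (data[i] == '\0')
        {
          /* Cancel: finish the partial object, then free it.  */
          obstack_free (ob, obstack_finish (ob));
          return NULL;
        }
      obstack_1grow (ob, data[i]);
    }
  obstack_1grow (ob, '\0');
  return (char *) obstack_finish (ob);
}
```

| After a cancellation the obstack holds no growing object, so the |
| next call starts from a clean state. |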
| |
| |
| File: libc.info, Node: Extra Fast Growing, Next: Status of an Obstack, Prev: Growing Objects, Up: Obstacks |
| |
| 3.2.4.7 Extra Fast Growing Objects |
| .................................. |
| |
| The usual functions for growing objects incur overhead for checking |
| whether there is room for the new growth in the current chunk. If you |
| are frequently constructing objects in small steps of growth, this |
| overhead can be significant. |
| |
| You can reduce the overhead by using special "fast growth" functions |
| that grow the object without checking. In order to have a robust |
| program, you must do the checking yourself. If you do this checking in |
| the simplest way each time you are about to add data to the object, you |
| have not saved anything, because that is what the ordinary growth |
| functions do. But if you can arrange to check less often, or check more |
| efficiently, then you make the program faster. |
| |
| The function 'obstack_room' returns the amount of room available in |
| the current chunk. It is declared as follows: |
| |
| -- Function: int obstack_room (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| This returns the number of bytes that can be added safely to the |
| current growing object (or to an object about to be started) in |
| obstack OBSTACK-PTR using the fast growth functions. |
| |
| While you know there is room, you can use these fast growth functions |
| for adding data to a growing object: |
| |
| -- Function: void obstack_1grow_fast (struct obstack *OBSTACK-PTR, char |
| C) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Unsafe |
| corrupt mem | *Note POSIX Safety Concepts::. |
| |
| The function 'obstack_1grow_fast' adds one byte containing the |
| character C to the growing object in obstack OBSTACK-PTR. |
| |
| -- Function: void obstack_ptr_grow_fast (struct obstack *OBSTACK-PTR, |
| void *DATA) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| The function 'obstack_ptr_grow_fast' adds 'sizeof (void *)' bytes |
| containing the value of DATA to the growing object in obstack |
| OBSTACK-PTR. |
| |
| -- Function: void obstack_int_grow_fast (struct obstack *OBSTACK-PTR, |
| int DATA) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| The function 'obstack_int_grow_fast' adds 'sizeof (int)' bytes |
| containing the value of DATA to the growing object in obstack |
| OBSTACK-PTR. |
| |
| -- Function: void obstack_blank_fast (struct obstack *OBSTACK-PTR, int |
| SIZE) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| The function 'obstack_blank_fast' adds SIZE bytes to the growing |
| object in obstack OBSTACK-PTR without initializing them. |
| |
| When you check for space using 'obstack_room' and there is not enough |
| room for what you want to add, the fast growth functions are not safe. |
| In this case, simply use the corresponding ordinary growth function |
| instead. Very soon this will copy the object to a new chunk; then there |
| will be lots of room available again. |
| |
| So, each time you use an ordinary growth function, check afterward |
| for sufficient space using 'obstack_room'. Once the object is copied to |
| a new chunk, there will be plenty of space again, so the program will |
| start using the fast growth functions again. |
| |
| Here is an example: |
| |
| void |
| add_string (struct obstack *obstack, const char *ptr, int len) |
| { |
|   while (len > 0) |
|     { |
|       int room = obstack_room (obstack); |
|       if (room == 0) |
|         { |
|           /* Not enough room. Add one character slowly, |
|              which may copy to a new chunk and make room. */ |
|           obstack_1grow (obstack, *ptr++); |
|           len--; |
|         } |
|       else |
|         { |
|           if (room > len) |
|             room = len; |
|           /* Add fast as much as we have room for. */ |
|           len -= room; |
|           while (room-- > 0) |
|             obstack_1grow_fast (obstack, *ptr++); |
|         } |
|     } |
| } |
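| |
| The same idea can be applied to larger units. The sketch below |
| checks for room once per batch of 'int' values instead of once per |
| value. The function 'add_ints' is hypothetical, and the chunk |
| macros are assumed to be 'malloc' and 'free': |

```c
#include <assert.h>
#include <obstack.h>
#include <stddef.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Append N ints from V to the object growing in OB, checking for
   room once per call instead of once per element.  */
void
add_ints (struct obstack *ob, const int *v, int n)
{
  size_t bytes;
  int i;

  assert (n >= 0);
  bytes = (size_t) n * sizeof (int);
  if ((size_t) obstack_room (ob) >= bytes)
    for (i = 0; i < n; i++)
      obstack_int_grow_fast (ob, v[i]);  /* No per-element check.  */
  else
    obstack_grow (ob, v, bytes);         /* Checked; may move the object.  */
}
```

| When the batch does not fit, the ordinary 'obstack_grow' call takes |
| over, and later batches usually find room again in the new chunk. |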
| |
| |
| File: libc.info, Node: Status of an Obstack, Next: Obstacks Data Alignment, Prev: Extra Fast Growing, Up: Obstacks |
| |
| 3.2.4.8 Status of an Obstack |
| ............................ |
| |
| Here are functions that provide information on the current status of |
| allocation in an obstack. You can use them to learn about an object |
| while still growing it. |
| |
| -- Function: void * obstack_base (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe | AS-Unsafe corrupt | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function returns the tentative address of the beginning of the |
| currently growing object in OBSTACK-PTR. If you finish the object |
| immediately, it will have that address. If you make it larger |
| first, it may outgrow the current chunk--then its address will |
| change! |
| |
| If no object is growing, this value says where the next object you |
| allocate will start (once again assuming it fits in the current |
| chunk). |
| |
| -- Function: void * obstack_next_free (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe | AS-Unsafe corrupt | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function returns the address of the first free byte in the |
| current chunk of obstack OBSTACK-PTR. This is the end of the |
| currently growing object. If no object is growing, |
| 'obstack_next_free' returns the same value as 'obstack_base'. |
| |
| -- Function: int obstack_object_size (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe race:obstack-ptr | AS-Safe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| This function returns the size in bytes of the currently growing |
| object. This is equivalent to |
| |
| obstack_next_free (OBSTACK-PTR) - obstack_base (OBSTACK-PTR) |
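| |
| As a small illustration, these functions make it possible to look |
| at a growing object in place. The helper 'growing_ends_with' is |
| hypothetical, and the chunk macros are assumed to be 'malloc' and |
| 'free': |

```c
#include <assert.h>
#include <obstack.h>
#include <stddef.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Return nonzero if the object currently growing in OB ends with
   the byte C.  The object is read in place; nothing is finished.  */
int
growing_ends_with (struct obstack *ob, char c)
{
  char *start = (char *) obstack_base (ob);
  char *end = (char *) obstack_next_free (ob);

  /* The two pointers bracket the growing object.  */
  assert ((size_t) (end - start) == (size_t) obstack_object_size (ob));
  return end > start && end[-1] == c;
}
```

| The pointers are fetched afresh on every call and never saved, |
| because further growth may move the object to another chunk. |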
| |
| |
| File: libc.info, Node: Obstacks Data Alignment, Next: Obstack Chunks, Prev: Status of an Obstack, Up: Obstacks |
| |
| 3.2.4.9 Alignment of Data in Obstacks |
| ..................................... |
| |
| Each obstack has an "alignment boundary"; each object allocated in the |
| obstack automatically starts on an address that is a multiple of the |
| specified boundary. By default, this boundary is chosen so that the |
| object can hold any type of data. |
| |
| To access an obstack's alignment boundary, use the macro |
| 'obstack_alignment_mask', whose function prototype looks like this: |
| |
| -- Macro: int obstack_alignment_mask (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The value is a bit mask; a bit that is 1 indicates that the |
| corresponding bit in the address of an object should be 0. The |
| mask value should be one less than a power of 2; the effect is that |
| all object addresses are multiples of that power of 2. The default |
| value of the mask is a value that allows aligned objects to hold |
| any type of data: for example, if its value is 3, any type of data |
| can be stored at locations whose addresses are multiples of 4. A |
| mask value of 0 means an object can start on any multiple of 1 |
| (that is, no alignment is required). |
| |
| The expansion of the macro 'obstack_alignment_mask' is an lvalue, |
| so you can alter the mask by assignment. For example, this |
| statement: |
| |
| obstack_alignment_mask (obstack_ptr) = 0; |
| |
| has the effect of turning off alignment processing in the specified |
| obstack. |
| |
| Note that a change in alignment mask does not take effect until |
| _after_ the next time an object is allocated or finished in the obstack. |
| If you are not growing an object, you can make the new alignment mask |
| take effect immediately by calling 'obstack_finish'. This will finish a |
| zero-length object and then do proper alignment for the next object. |
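| |
| A sketch of this sequence, assuming that no object is growing and |
| that the chunk macros are 'malloc' and 'free'. The helper names |
| 'use_16_byte_alignment' and 'is_16_aligned' are hypothetical: |

```c
#include <assert.h>
#include <obstack.h>
#include <stdint.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Make every later object in OB start on a 16-byte boundary.
   Assumes no object is currently growing in OB.  */
void
use_16_byte_alignment (struct obstack *ob)
{
  assert (obstack_object_size (ob) == 0);
  obstack_alignment_mask (ob) = 16 - 1;  /* One less than a power of 2.  */
  (void) obstack_finish (ob);   /* Zero-length object; applies the mask.  */
}

/* Return nonzero if P is a multiple of 16.  */
int
is_16_aligned (const void *p)
{
  return ((uintptr_t) p & 15) == 0;
}
```

| Objects allocated after the call, whatever their size, should then |
| satisfy 'is_16_aligned'. |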
| |
| |
| File: libc.info, Node: Obstack Chunks, Next: Summary of Obstacks, Prev: Obstacks Data Alignment, Up: Obstacks |
| |
| 3.2.4.10 Obstack Chunks |
| ....................... |
| |
| Obstacks work by allocating space for themselves in large chunks, and |
| then parceling out space in the chunks to satisfy your requests. Chunks |
| are normally 4096 bytes long unless you specify a different chunk size. |
| The chunk size includes 8 bytes of overhead that are not actually used |
| for storing objects. Regardless of the specified size, longer chunks |
| will be allocated when necessary for long objects. |
| |
| The obstack library allocates chunks by calling the function |
| 'obstack_chunk_alloc', which you must define. When a chunk is no longer |
| needed because you have freed all the objects in it, the obstack library |
| frees the chunk by calling 'obstack_chunk_free', which you must also |
| define. |
| |
| These two must be defined (as macros) or declared (as functions) in |
| each source file that uses 'obstack_init' (*note Creating Obstacks::). |
| Most often they are defined as macros like this: |
| |
| #define obstack_chunk_alloc malloc |
| #define obstack_chunk_free free |
| |
| Note that these are simple macros (no arguments). Macro definitions |
| with arguments will not work! It is necessary that |
| 'obstack_chunk_alloc' or 'obstack_chunk_free', alone, expand into a |
| function name if it is not itself a function name. |
| |
| If you allocate chunks with 'malloc', the chunk size should be a |
| power of 2. The default chunk size, 4096, was chosen because it is long |
| enough to satisfy many typical requests on the obstack yet short enough |
| not to waste too much memory in the portion of the last chunk not yet |
| used. |
| |
| -- Macro: int obstack_chunk_size (struct obstack *OBSTACK-PTR) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This returns the chunk size of the given obstack. |
| |
| Since this macro expands to an lvalue, you can specify a new chunk |
| size by assigning it a new value. Doing so does not affect the chunks |
| already allocated, but will change the size of chunks allocated for that |
| particular obstack in the future. It is unlikely to be useful to make |
| the chunk size smaller, but making it larger might improve efficiency if |
| you are allocating many objects whose size is comparable to the chunk |
| size. Here is how to do so cleanly: |
| |
| if (obstack_chunk_size (obstack_ptr) < NEW-CHUNK-SIZE) |
| obstack_chunk_size (obstack_ptr) = NEW-CHUNK-SIZE; |
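| |
| The pieces above might fit together as in this sketch. The wrapper |
| 'counting_chunk_alloc', the counter 'chunks_allocated' and the |
| function 'init_with_chunk_size' are hypothetical; the wrapper is |
| assumed to take the same argument as 'malloc': |

```c
#include <assert.h>
#include <obstack.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper that counts how many chunks were requested.  */
static int chunks_allocated;

static void *
counting_chunk_alloc (size_t size)
{
  chunks_allocated++;
  return malloc (size);
}

/* Simple macros without arguments, as the text requires.  */
#define obstack_chunk_alloc counting_chunk_alloc
#define obstack_chunk_free free

/* Initialize OB and make its future chunks at least MIN_SIZE bytes.  */
void
init_with_chunk_size (struct obstack *ob, long min_size)
{
  assert (min_size > 0);
  obstack_init (ob);
  if (obstack_chunk_size (ob) < min_size)
    obstack_chunk_size (ob) = min_size;
}
```

| The first chunk is obtained by 'obstack_init' at the old size; only |
| chunks allocated afterward use the larger size. |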
| |
| |
| File: libc.info, Node: Summary of Obstacks, Prev: Obstack Chunks, Up: Obstacks |
| |
| 3.2.4.11 Summary of Obstack Functions |
| ..................................... |
| |
| Here is a summary of all the functions associated with obstacks. Each |
| takes the address of an obstack ('struct obstack *') as its first |
| argument. |
| |
| 'void obstack_init (struct obstack *OBSTACK-PTR)' |
| Initialize use of an obstack. *Note Creating Obstacks::. |
| |
| 'void *obstack_alloc (struct obstack *OBSTACK-PTR, int SIZE)' |
| Allocate an object of SIZE uninitialized bytes. *Note Allocation |
| in an Obstack::. |
| |
| 'void *obstack_copy (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)' |
| Allocate an object of SIZE bytes, with contents copied from |
| ADDRESS. *Note Allocation in an Obstack::. |
| |
| 'void *obstack_copy0 (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)' |
| Allocate an object of SIZE+1 bytes, with SIZE of them copied from |
| ADDRESS, followed by a null character at the end. *Note Allocation |
| in an Obstack::. |
| |
| 'void obstack_free (struct obstack *OBSTACK-PTR, void *OBJECT)' |
| Free OBJECT (and everything allocated in the specified obstack more |
| recently than OBJECT). *Note Freeing Obstack Objects::. |
| |
| 'void obstack_blank (struct obstack *OBSTACK-PTR, int SIZE)' |
| Add SIZE uninitialized bytes to a growing object. *Note Growing |
| Objects::. |
| |
| 'void obstack_grow (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)' |
| Add SIZE bytes, copied from ADDRESS, to a growing object. *Note |
| Growing Objects::. |
| |
| 'void obstack_grow0 (struct obstack *OBSTACK-PTR, void *ADDRESS, int SIZE)' |
| Add SIZE bytes, copied from ADDRESS, to a growing object, and then |
| add another byte containing a null character. *Note Growing |
| Objects::. |
| |
| 'void obstack_1grow (struct obstack *OBSTACK-PTR, char DATA-CHAR)' |
| Add one byte containing DATA-CHAR to a growing object. *Note |
| Growing Objects::. |
| |
| 'void *obstack_finish (struct obstack *OBSTACK-PTR)' |
| Finalize the object that is growing and return its permanent |
| address. *Note Growing Objects::. |
| |
| 'int obstack_object_size (struct obstack *OBSTACK-PTR)' |
| Get the current size of the currently growing object. *Note |
| Growing Objects::. |
| |
| 'void obstack_blank_fast (struct obstack *OBSTACK-PTR, int SIZE)' |
| Add SIZE uninitialized bytes to a growing object without checking |
| that there is enough room. *Note Extra Fast Growing::. |
| |
| 'void obstack_1grow_fast (struct obstack *OBSTACK-PTR, char DATA-CHAR)' |
| Add one byte containing DATA-CHAR to a growing object without |
| checking that there is enough room. *Note Extra Fast Growing::. |
| |
| 'int obstack_room (struct obstack *OBSTACK-PTR)' |
| Get the amount of room now available for growing the current |
| object. *Note Extra Fast Growing::. |
| |
| 'int obstack_alignment_mask (struct obstack *OBSTACK-PTR)' |
| The mask used for aligning the beginning of an object. This is an |
| lvalue. *Note Obstacks Data Alignment::. |
| |
| 'int obstack_chunk_size (struct obstack *OBSTACK-PTR)' |
| The size for allocating chunks. This is an lvalue. *Note Obstack |
| Chunks::. |
| |
| 'void *obstack_base (struct obstack *OBSTACK-PTR)' |
| Tentative starting address of the currently growing object. *Note |
| Status of an Obstack::. |
| |
| 'void *obstack_next_free (struct obstack *OBSTACK-PTR)' |
| Address just after the end of the currently growing object. *Note |
| Status of an Obstack::. |
| |
| |
| File: libc.info, Node: Variable Size Automatic, Prev: Obstacks, Up: Memory Allocation |
| |
| 3.2.5 Automatic Storage with Variable Size |
| ------------------------------------------ |
| |
| The function 'alloca' supports a kind of half-dynamic allocation in |
| which blocks are allocated dynamically but freed automatically. |
| |
| Allocating a block with 'alloca' is an explicit action; you can |
| allocate as many blocks as you wish, and compute the size at run time. |
| But all the blocks are freed when you exit the function that 'alloca' |
| was called from, just as if they were automatic variables declared in |
| that function. There is no way to free the space explicitly. |
| |
| The prototype for 'alloca' is in 'stdlib.h'. This function is a BSD |
| extension. |
| |
| -- Function: void * alloca (size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The return value of 'alloca' is the address of a block of SIZE |
| bytes of memory, allocated in the stack frame of the calling |
| function. |
| |
| Do not use 'alloca' inside the arguments of a function call--you will |
| get unpredictable results, because the stack space for the 'alloca' |
| would appear on the stack in the middle of the space for the function |
| arguments. An example of what to avoid is 'foo (x, alloca (4), y)'. |
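| |
| A safer arrangement, sketched here with a hypothetical 'foo' and a |
| hypothetical wrapper 'call_foo', is to call 'alloca' in a statement |
| of its own and pass the resulting pointer. The sketch includes |
| 'alloca.h', which is where current GNU systems declare 'alloca': |

```c
#include <alloca.h>
#include <assert.h>
#include <string.h>

/* Hypothetical callee standing in for 'foo' in the text.  It fills
   the four-byte scratch block and returns a value derived from it.  */
static int
foo (int x, char *scratch, int y)
{
  memset (scratch, 'a', 3);
  scratch[3] = '\0';
  return (int) strlen (scratch) + x + y;
}

int
call_foo (int x, int y)
{
  /* Allocate first, in a statement of its own ...  */
  char *scratch = (char *) alloca (4);

  assert (scratch != NULL);
  /* ... then pass the pointer as an ordinary argument.  */
  return foo (x, scratch, y);
}
```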
| |
| * Menu: |
| |
| * Alloca Example:: Example of using 'alloca'. |
| * Advantages of Alloca:: Reasons to use 'alloca'. |
| * Disadvantages of Alloca:: Reasons to avoid 'alloca'. |
| * GNU C Variable-Size Arrays:: Only in GNU C, here is an alternative |
| method of allocating dynamically and |
| freeing automatically. |
| |
| |
| File: libc.info, Node: Alloca Example, Next: Advantages of Alloca, Up: Variable Size Automatic |
| |
| 3.2.5.1 'alloca' Example |
| ........................ |
| |
| As an example of the use of 'alloca', here is a function that opens a |
| file name made from concatenating two argument strings, and returns a |
| file descriptor or minus one signifying failure: |
| |
| int |
| open2 (char *str1, char *str2, int flags, int mode) |
| { |
|   char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1); |
|   stpcpy (stpcpy (name, str1), str2); |
|   return open (name, flags, mode); |
| } |
| |
| Here is how you would get the same results with 'malloc' and 'free': |
| |
| int |
| open2 (char *str1, char *str2, int flags, int mode) |
| { |
|   char *name = (char *) malloc (strlen (str1) + strlen (str2) + 1); |
|   int desc; |
|   if (name == 0) |
|     fatal ("virtual memory exceeded"); |
|   stpcpy (stpcpy (name, str1), str2); |
|   desc = open (name, flags, mode); |
|   free (name); |
|   return desc; |
| } |
| |
| As you can see, it is simpler with 'alloca'. But 'alloca' has other, |
| more important advantages, and some disadvantages. |
| |
| |
| File: libc.info, Node: Advantages of Alloca, Next: Disadvantages of Alloca, Prev: Alloca Example, Up: Variable Size Automatic |
| |
| 3.2.5.2 Advantages of 'alloca' |
| .............................. |
| |
| Here are the reasons why 'alloca' may be preferable to 'malloc': |
| |
| * Using 'alloca' wastes very little space and is very fast. (It is |
| open-coded by the GNU C compiler.) |
| |
| * Since 'alloca' does not have separate pools for different sizes of |
| block, space used for any size block can be reused for any other |
| size. 'alloca' does not cause memory fragmentation. |
| |
| * Nonlocal exits done with 'longjmp' (*note Non-Local Exits::) |
| automatically free the space allocated with 'alloca' when they exit |
| through the function that called 'alloca'. This is the most |
| important reason to use 'alloca'. |
| |
| To illustrate this, suppose you have a function |
| 'open_or_report_error' which returns a descriptor, like 'open', if |
| it succeeds, but does not return to its caller if it fails. If the |
| file cannot be opened, it prints an error message and jumps out to |
| the command level of your program using 'longjmp'. Let's change |
| 'open2' (*note Alloca Example::) to use this subroutine: |
| |
| int |
| open2 (char *str1, char *str2, int flags, int mode) |
| { |
| char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1); |
| stpcpy (stpcpy (name, str1), str2); |
| return open_or_report_error (name, flags, mode); |
| } |
| |
| Because of the way 'alloca' works, the memory it allocates is freed |
| even when an error occurs, with no special effort required. |
| |
| By contrast, the previous definition of 'open2' (which uses |
| 'malloc' and 'free') would develop a memory leak if it were changed |
| in this way. Even if you are willing to make more changes to fix |
| it, there is no easy way to do so. |
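| |
| The scenario above can be sketched as a complete program fragment. |
| This is only an illustration: 'command_level', 'try_open2' and the body |
| of 'open_or_report_error' are hypothetical names and code invented for |
| the sketch, and 'strcpy' is used in place of 'stpcpy' so that it builds |
| without GNU extensions. |

```c
#include <alloca.h>
#include <fcntl.h>
#include <setjmp.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical recovery point standing in for the program's command level. */
static jmp_buf command_level;

/* Hypothetical helper: behaves like open, but never returns on failure. */
static int
open_or_report_error (const char *name, int flags, int mode)
{
  int desc = open (name, flags, mode);
  if (desc < 0)
    longjmp (command_level, 1);
  return desc;
}

static int
open2 (const char *str1, const char *str2, int flags, int mode)
{
  size_t len1 = strlen (str1);
  char *name = alloca (len1 + strlen (str2) + 1);
  strcpy (name, str1);
  strcpy (name + len1, str2);
  /* If this call jumps out, open2's frame (and NAME with it) is discarded. */
  return open_or_report_error (name, flags, mode);
}

/* Return 0 if the file could be opened, -1 if open2 was exited by longjmp. */
int
try_open2 (const char *str1, const char *str2)
{
  if (setjmp (command_level) != 0)
    return -1;
  int desc = open2 (str1, str2, O_RDONLY, 0);
  close (desc);
  return 0;
}
```

| Nothing in 'try_open2' has to release the file name after the jump; |
| the space went away together with the stack frame of 'open2'. |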
| |
| |
| File: libc.info, Node: Disadvantages of Alloca, Next: GNU C Variable-Size Arrays, Prev: Advantages of Alloca, Up: Variable Size Automatic |
| |
| 3.2.5.3 Disadvantages of 'alloca' |
| ................................. |
| |
| These are the disadvantages of 'alloca' in comparison with 'malloc': |
| |
| * If you try to allocate more memory than the machine can provide, |
| you don't get a clean error message. Instead you get a fatal |
| signal like the one you would get from an infinite recursion; |
| probably a segmentation violation (*note Program Error Signals::). |
| |
| * Some non-GNU systems fail to support 'alloca', so it is less |
| portable. However, a slower emulation of 'alloca' written in C is |
| available for use on systems with this deficiency. |
| |
| |
| File: libc.info, Node: GNU C Variable-Size Arrays, Prev: Disadvantages of Alloca, Up: Variable Size Automatic |
| |
| 3.2.5.4 GNU C Variable-Size Arrays |
| .................................. |
| |
| In GNU C, you can replace most uses of 'alloca' with an array of |
| variable size. Here is how 'open2' would look then: |
| |
| int open2 (char *str1, char *str2, int flags, int mode) |
| { |
| char name[strlen (str1) + strlen (str2) + 1]; |
| stpcpy (stpcpy (name, str1), str2); |
| return open (name, flags, mode); |
| } |
| |
| But 'alloca' is not always equivalent to a variable-sized array, for |
| several reasons: |
| |
| * A variable size array's space is freed at the end of the scope of |
| the name of the array. The space allocated with 'alloca' remains |
| until the end of the function. |
| |
| * It is possible to use 'alloca' within a loop, allocating an |
| additional block on each iteration. This is impossible with |
| variable-sized arrays. |
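| |
| As a sketch of the loop case, the hypothetical function below builds a |
| linked list whose nodes are all obtained from 'alloca', one per |
| iteration. The names 'struct node' and 'sum_with_alloca_list' are |
| invented for the example, and N is assumed to be small enough that the |
| list fits comfortably on the stack. |

```c
#include <alloca.h>
#include <stddef.h>

struct node
{
  int value;
  struct node *next;
};

/* Sum 1..N by way of a list that lives entirely in this stack frame. */
long
sum_with_alloca_list (int n)
{
  struct node *head = NULL;
  for (int i = 1; i <= n; i++)
    {
      /* Each iteration adds a new block; none is freed until return. */
      struct node *p = alloca (sizeof *p);
      p->value = i;
      p->next = head;
      head = p;
    }

  long total = 0;
  for (const struct node *p = head; p != NULL; p = p->next)
    total += p->value;
  return total;
}
```

| All of the nodes are released together when the function returns. |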
| |
| *NB:* If you mix use of 'alloca' and variable-sized arrays within one |
| function, exiting a scope in which a variable-sized array was declared |
| frees all blocks allocated with 'alloca' during the execution of that |
| scope. |
| |
| |
| File: libc.info, Node: Resizing the Data Segment, Next: Locking Pages, Prev: Memory Allocation, Up: Memory |
| |
| 3.3 Resizing the Data Segment |
| ============================= |
| |
| The symbols in this section are declared in 'unistd.h'. |
| |
| You will not normally use the functions in this section, because the |
| functions described in *note Memory Allocation:: are easier to use. |
| Those are interfaces to a GNU C Library memory allocator that uses the |
| functions below itself. The functions below are simple interfaces to |
| system calls. |
| |
| -- Function: int brk (void *ADDR) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'brk' sets the high end of the calling process' data segment to |
| ADDR. |
| |
| The address of the end of a segment is defined to be the address of |
| the last byte in the segment plus 1. |
| |
| The function has no effect if ADDR is lower than the low end of the |
| data segment. (This is considered success, by the way). |
| |
| The function fails if it would cause the data segment to overlap |
| another segment or exceed the process' data storage limit (*note |
| Limits on Resources::). |
| |
| The function is named for a common historical case where data |
| storage and the stack are in the same segment. Data storage |
| allocation grows upward from the bottom of the segment while the |
| stack grows downward toward it from the top of the segment and the |
| curtain between them is called the "break". |
| |
| The return value is zero on success. On failure, the return value |
| is '-1' and 'errno' is set accordingly. The following 'errno' |
| values are specific to this function: |
| |
| 'ENOMEM' |
| The request would cause the data segment to overlap another |
| segment or exceed the process' data storage limit. |
| |
| -- Function: void *sbrk (ptrdiff_t DELTA) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is the same as 'brk' except that you specify the new |
| end of the data segment as an offset DELTA from the current end and |
| on success the return value is the address of the previous end of |
| the data segment (the start of any newly added space) instead of |
| zero. On failure, the return value is '(void *) -1'. |
| |
| This means you can use 'sbrk(0)' to find out what the current end |
| of the data segment is. |
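| |
| A minimal sketch of querying and moving the end of the data segment |
| follows. The helper name 'probe_break' is hypothetical. The sketch |
| assumes a single-threaded program in which nothing else (such as |
| 'malloc') moves the break between the calls, and it defines |
| '_DEFAULT_SOURCE' on the assumption that this is needed to make the |
| declaration of 'sbrk' visible under strict compiler modes. It calls |
| 'sbrk (0)' before and after the resize instead of relying on the value |
| returned by the resizing call. |

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Grow the data segment by DELTA bytes, then restore it.
   Return how far the end actually moved, or -1 on failure. */
long
probe_break (intptr_t delta)
{
  char *before = sbrk (0);          /* current end of the data segment */
  if (before == (void *) -1)
    return -1;

  if (sbrk (delta) == (void *) -1)  /* request DELTA more bytes */
    return -1;

  char *after = sbrk (0);           /* end of the data segment now */
  long moved = (long) (after - before);

  sbrk (-delta);                    /* give the space back */
  return moved;
}
```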
| |
| |
| File: libc.info, Node: Locking Pages, Prev: Resizing the Data Segment, Up: Memory |
| |
| 3.4 Locking Pages |
| ================= |
| |
| You can tell the system to associate a particular virtual memory page |
| with a real page frame and keep it that way -- i.e., cause the page to |
| be paged in if it isn't already and mark it so it will never be paged |
| out and consequently will never cause a page fault. This is called |
| "locking" a page. |
| |
| The functions in this section lock and unlock the calling process' |
| pages. |
| |
| * Menu: |
| |
| * Why Lock Pages:: Reasons to read this section. |
| * Locked Memory Details:: Everything you need to know about locked |
| memory. |
| * Page Lock Functions:: Here's how to do it. |
| |
| |
| File: libc.info, Node: Why Lock Pages, Next: Locked Memory Details, Up: Locking Pages |
| |
| 3.4.1 Why Lock Pages |
| -------------------- |
| |
| Because page faults cause paged out pages to be paged in transparently, |
| a process rarely needs to be concerned about locking pages. However, |
| there are two reasons people sometimes are: |
| |
| * Speed. A page fault is transparent only insofar as the process is |
| not sensitive to how long it takes to do a simple memory access. |
| Time-critical processes, especially realtime processes, may not be |
| able to wait or may not be able to tolerate variance in execution |
| speed. |
| |
| A process that needs to lock pages for this reason probably also |
| needs priority among other processes for use of the CPU. *Note |
| Priority::. |
| |
| In some cases, the programmer knows better than the system's demand |
| paging allocator which pages should remain in real memory to |
| optimize system performance. In this case, locking pages can help. |
| |
| * Privacy. If you keep secrets in virtual memory and that virtual |
| memory gets paged out, that increases the chance that the secrets |
| will get out. If a password gets written out to disk swap space, |
| for example, it might still be there long after virtual and real |
| memory have been wiped clean. |
| |
| Be aware that when you lock a page, that's one fewer page frame that |
| can be used to back other virtual memory (by the same or other |
| processes), which can mean more page faults, which means the system runs |
| more slowly. In fact, if you lock enough memory, some programs may not |
| be able to run at all for lack of real memory. |
| |
| |
| File: libc.info, Node: Locked Memory Details, Next: Page Lock Functions, Prev: Why Lock Pages, Up: Locking Pages |
| |
| 3.4.2 Locked Memory Details |
| --------------------------- |
| |
| A memory lock is associated with a virtual page, not a real frame. The |
| paging rule is: If a frame backs at least one locked page, don't page it |
| out. |
| |
| Memory locks do not stack. I.e., you can't lock a particular page |
| twice so that it has to be unlocked twice before it is truly unlocked. |
| It is either locked or it isn't. |
| |
| A memory lock persists until the process that owns the memory |
| explicitly unlocks it. (But process termination and exec cause the |
| virtual memory to cease to exist, which you might say means it isn't |
| locked any more). |
| |
| Memory locks are not inherited by child processes. (But note that on |
| a modern Unix system, immediately after a fork, the parent's and the |
| child's virtual address space are backed by the same real page frames, |
| so the child enjoys the parent's locks). *Note Creating a Process::. |
| |
| Because of its ability to impact other processes, only the superuser |
| can lock a page. Any process can unlock its own page. |
| |
| The system sets limits on the amount of memory a process can have |
| locked and the amount of real memory it can have dedicated to it. *Note |
| Limits on Resources::. |
| |
| In Linux, locked pages aren't as locked as you might think. Two |
| virtual pages that are not shared memory can nonetheless be backed by |
| the same real frame. The kernel does this in the name of efficiency |
| when it knows both virtual pages contain identical data, and does it |
| even if one or both of the virtual pages are locked. |
| |
| But when a process modifies one of those pages, the kernel must get |
| it a separate frame and fill it with the page's data. This is known as |
| a "copy-on-write page fault". It takes a small amount of time and in a |
| pathological case, getting that frame may require I/O. |
| |
| To make sure this doesn't happen to your program, don't just lock the |
| pages. Write to them as well, unless you know you won't write to them |
| ever. And to make sure you have pre-allocated frames for your stack, |
| enter a scope that declares a C automatic variable larger than the |
| maximum stack size you will need, set it to something, then return from |
| its scope. |
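| |
| The stack advice can be sketched as follows. 'MAX_STACK_BYTES' and |
| 'prefault_stack' are hypothetical names, and the 64 KiB figure is only |
| an assumed worst case; a real program would substitute its own bound |
| and call the function after locking its pages. |

```c
#include <stddef.h>
#include <unistd.h>

/* Assumed upper bound on the stack the program will ever need. */
enum { MAX_STACK_BYTES = 64 * 1024 };

/* Write to one byte in every page of a large automatic array so that
   each stack page gets its own frame.  Return the number of writes. */
size_t
prefault_stack (void)
{
  /* volatile keeps the compiler from discarding the stores. */
  volatile unsigned char dummy[MAX_STACK_BYTES];
  long page = sysconf (_SC_PAGESIZE);
  size_t step = page > 0 ? (size_t) page : 4096;
  size_t touched = 0;

  for (size_t i = 0; i < sizeof dummy; i += step)
    {
      dummy[i] = 0;
      touched++;
    }
  dummy[sizeof dummy - 1] = 0;      /* make sure the last page is hit too */
  return touched;
}
```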
| |
| |
| File: libc.info, Node: Page Lock Functions, Prev: Locked Memory Details, Up: Locking Pages |
| |
| 3.4.3 Functions To Lock And Unlock Pages |
| ---------------------------------------- |
| |
| The symbols in this section are declared in 'sys/mman.h'. These |
| functions are defined by POSIX.1b, but their availability depends on |
| your kernel. If your kernel doesn't allow these functions, they exist |
| but always fail. They _are_ available with a Linux kernel. |
| |
| *Portability Note:* POSIX.1b requires that when the 'mlock' and |
| 'munlock' functions are available, the file 'unistd.h' define the macro |
| '_POSIX_MEMLOCK_RANGE' and the file 'limits.h' define the macro |
| 'PAGESIZE' to be the size of a memory page in bytes. It requires that |
| when the 'mlockall' and 'munlockall' functions are available, the |
| 'unistd.h' file define the macro '_POSIX_MEMLOCK'. The GNU C Library |
| conforms to this requirement. |
| |
| -- Function: int mlock (const void *ADDR, size_t LEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'mlock' locks a range of the calling process' virtual pages. |
| |
| The range of memory starts at address ADDR and is LEN bytes long. |
| Actually, since you must lock whole pages, it is the range of pages |
| that include any part of the specified range. |
| |
| When the function returns successfully, each of those pages is |
| backed by (connected to) a real frame (is resident) and is marked |
| to stay that way. This means the function may cause page-ins and |
| have to wait for them. |
| |
| When the function fails, it does not affect the lock status of any |
| pages. |
| |
| The return value is zero if the function succeeds. Otherwise, it |
| is '-1' and 'errno' is set accordingly. 'errno' values specific to |
| this function are: |
| |
| 'ENOMEM' |
| * At least some of the specified address range does not |
| exist in the calling process' virtual address space. |
| * The locking would cause the process to exceed its locked |
| page limit. |
| |
| 'EPERM' |
| The calling process is not superuser. |
| |
| 'EINVAL' |
| LEN is not positive. |
| |
| 'ENOSYS' |
| The kernel does not provide 'mlock' capability. |
| |
| You can lock _all_ of a process' memory with 'mlockall'. You unlock |
| memory with 'munlock' or 'munlockall'. |
| |
| To avoid all page faults in a C program, you have to use |
| 'mlockall', because some of the memory a program uses is hidden |
| from the C code, e.g. the stack and automatic variables, and you |
| wouldn't know what address to tell 'mlock'. |
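| |
| As an illustration of the privacy use case, the sketch below locks a |
| small buffer before placing a secret in it, then wipes and unlocks it. |
| The function name 'use_secret_locked' and the buffer size are |
| assumptions made for the example. A caller must be prepared for |
| 'mlock' to fail, for instance with 'EPERM' or 'ENOMEM'. |

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hold a copy of SECRET only in locked memory.
   Return 0 on success, or the errno value of the call that failed. */
int
use_secret_locked (const char *secret)
{
  char buf[128];

  /* Lock first, so the secret never sits in a pageable buffer. */
  if (mlock (buf, sizeof buf) != 0)
    return errno;

  strncpy (buf, secret, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';

  /* ... use buf here ... */

  /* Wipe through a volatile pointer so the stores are not optimized away. */
  volatile char *p = buf;
  for (size_t i = 0; i < sizeof buf; i++)
    p[i] = 0;

  if (munlock (buf, sizeof buf) != 0)
    return errno;
  return 0;
}
```

| Because whole pages are locked, 'munlock' here also unlocks anything |
| else that happens to share those pages. |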
| |
| -- Function: int munlock (const void *ADDR, size_t LEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'munlock' unlocks a range of the calling process' virtual pages. |
| |
| 'munlock' is the inverse of 'mlock' and functions completely |
| analogously to 'mlock', except that there is no 'EPERM' failure. |
| |
| -- Function: int mlockall (int FLAGS) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'mlockall' locks all the pages in a process' virtual memory address |
| space, and/or any that are added to it in the future. This |
| includes the pages of the code, data and stack segment, as well as |
| shared libraries, user space kernel data, shared memory, and memory |
| mapped files. |
| |
| FLAGS is a string of single bit flags represented by the following |
| macros. They tell 'mlockall' which of its functions you want. All |
| other bits must be zero. |
| |
| 'MCL_CURRENT' |
| Lock all pages which currently exist in the calling process' |
| virtual address space. |
| |
| 'MCL_FUTURE' |
| Set a mode such that any pages added to the process' virtual |
| address space in the future will be locked from birth. This |
| mode does not affect future address spaces owned by the same |
| process so exec, which replaces a process' address space, |
| wipes out 'MCL_FUTURE'. *Note Executing a File::. |
| |
| When the function returns successfully, and you specified |
| 'MCL_CURRENT', all of the process' pages are backed by (connected |
| to) real frames (they are resident) and are marked to stay that |
| way. This means the function may cause page-ins and have to wait |
| for them. |
| |
| When the process is in 'MCL_FUTURE' mode because it successfully |
| executed this function and specified 'MCL_FUTURE', any system call |
| by the process that requires space be added to its virtual address |
| space fails with 'errno' = 'ENOMEM' if locking the additional space |
| would cause the process to exceed its locked page limit. In the |
| case that the address space addition that can't be accommodated is |
| stack expansion, the stack expansion fails and the kernel sends a |
| 'SIGSEGV' signal to the process. |
| |
| When the function fails, it does not affect the lock status of any |
| pages or the future locking mode. |
| |
| The return value is zero if the function succeeds. Otherwise, it |
| is '-1' and 'errno' is set accordingly. 'errno' values specific to |
| this function are: |
| |
| 'ENOMEM' |
| The locking would cause the process to exceed its locked page |
| limit. |
| |
| 'EPERM' |
| The calling process is not superuser. |
| |
| 'EINVAL' |
| Undefined bits in FLAGS are not zero. |
| |
| 'ENOSYS' |
| The kernel does not provide 'mlockall' capability. |
| |
| You can lock just specific pages with 'mlock'. You unlock pages |
| with 'munlockall' and 'munlock'. |
| |
| -- Function: int munlockall (void) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'munlockall' unlocks every page in the calling process' virtual |
| address space and turns off 'MCL_FUTURE' future locking mode. |
| |
| The return value is zero if the function succeeds. Otherwise, it |
| is '-1' and 'errno' is set accordingly. The only way this function |
| can fail is for generic reasons that all functions and system calls |
| can fail, so there are no specific 'errno' values. |
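| |
| A minimal sketch of locking and unlocking the whole address space |
| follows. The wrapper names 'lock_everything' and 'unlock_everything' |
| are hypothetical; they only translate the '-1'/'errno' convention into |
| a single return value. |

```c
#include <errno.h>
#include <sys/mman.h>

/* Lock all current pages and all pages added later.
   Return 0 on success, or the errno value on failure. */
int
lock_everything (void)
{
  if (mlockall (MCL_CURRENT | MCL_FUTURE) != 0)
    return errno;               /* e.g. EPERM or ENOMEM */
  return 0;
}

/* Unlock every page and leave MCL_FUTURE mode.
   Return 0 on success, or the errno value on failure. */
int
unlock_everything (void)
{
  if (munlockall () != 0)
    return errno;
  return 0;
}
```

| A time-critical program would typically call 'lock_everything' once |
| during startup and check the result before entering its main loop. |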
| |
| |
| File: libc.info, Node: Character Handling, Next: String and Array Utilities, Prev: Memory, Up: Top |
| |
| 4 Character Handling |
| ******************** |
| |
| Programs that work with characters and strings often need to classify a |
| character--is it alphabetic, is it a digit, is it whitespace, and so |
| on--and perform case conversion operations on characters. The functions |
| in the header file 'ctype.h' are provided for this purpose. |
| |
| Since the choice of locale and character set can alter the |
| classifications of particular character codes, all of these functions |
| are affected by the current locale. (More precisely, they are affected |
| by the locale currently selected for character classification--the |
| 'LC_CTYPE' category; see *note Locale Categories::.) |
| |
| The ISO C standard specifies two different sets of functions. The |
| one set works on 'char' type characters, the other one on 'wchar_t' wide |
| characters (*note Extended Char Intro::). |
| |
| * Menu: |
| |
| * Classification of Characters:: Testing whether characters are |
| letters, digits, punctuation, etc. |
| |
| * Case Conversion:: Case mapping, and the like. |
| * Classification of Wide Characters:: Character class determination for |
| wide characters. |
| * Using Wide Char Classes:: Notes on using the wide character |
| classes. |
| * Wide Character Case Conversion:: Mapping of wide characters. |
| |
| |
| File: libc.info, Node: Classification of Characters, Next: Case Conversion, Up: Character Handling |
| |
| 4.1 Classification of Characters |
| ================================ |
| |
| This section explains the library functions for classifying characters. |
| For example, 'isalpha' is the function to test for an alphabetic |
| character. It takes one argument, the character to test, and returns a |
| nonzero integer if the character is alphabetic, and zero otherwise. You |
| would use it like this: |
| |
| if (isalpha (c)) |
| printf ("The character `%c' is alphabetic.\n", c); |
| |
| Each of the functions in this section tests for membership in a |
| particular class of characters; each has a name starting with 'is'. |
| Each of them takes one argument, which is a character to test, and |
| returns an 'int' which is treated as a boolean value. The character |
| argument is passed as an 'int', and it may be the constant value 'EOF' |
| instead of a real character. |
| |
| The attributes of any given character can vary between locales. |
| *Note Locales::, for more information on locales. |
| |
| These functions are declared in the header file 'ctype.h'. |
| |
| -- Function: int islower (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a lower-case letter. The letter need not be |
| from the Latin alphabet; any representable alphabet is valid. |
| |
| -- Function: int isupper (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is an upper-case letter. The letter need not be |
| from the Latin alphabet; any representable alphabet is valid. |
| |
| -- Function: int isalpha (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is an alphabetic character (a letter). If |
| 'islower' or 'isupper' is true of a character, then 'isalpha' is |
| also true. |
| |
| In some locales, there may be additional characters for which |
| 'isalpha' is true--letters which are neither upper case nor lower |
| case. But in the standard '"C"' locale, there are no such |
| additional characters. |
| |
| -- Function: int isdigit (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a decimal digit ('0' through '9'). |
| |
| -- Function: int isalnum (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is an alphanumeric character (a letter or |
| number); in other words, if either 'isalpha' or 'isdigit' is true |
| of a character, then 'isalnum' is also true. |
| |
| -- Function: int isxdigit (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a hexadecimal digit. Hexadecimal digits |
| include the normal decimal digits '0' through '9' and the letters |
| 'A' through 'F' and 'a' through 'f'. |
| |
| -- Function: int ispunct (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a punctuation character. This means any |
| printing character that is not alphanumeric or a space character. |
| |
| -- Function: int isspace (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a "whitespace" character. In the standard |
| '"C"' locale, 'isspace' returns true for only the standard |
| whitespace characters: |
| |
| '' '' |
| space |
| |
| ''\f'' |
| formfeed |
| |
| ''\n'' |
| newline |
| |
| ''\r'' |
| carriage return |
| |
| ''\t'' |
| horizontal tab |
| |
| ''\v'' |
| vertical tab |
| |
| -- Function: int isblank (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a blank character; that is, a space or a tab. |
| This function was originally a GNU extension, but was added in |
| ISO C99. |
| |
| -- Function: int isgraph (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a graphic character; that is, a character that |
| has a glyph associated with it. The whitespace characters are not |
| considered graphic. |
| |
| -- Function: int isprint (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a printing character. Printing characters |
| include all the graphic characters, plus the space (' ') character. |
| |
| -- Function: int iscntrl (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a control character (that is, a character that |
| is not a printing character). |
| |
| -- Function: int isascii (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns true if C is a 7-bit 'unsigned char' value that fits into |
| the US/UK ASCII character set. This function is a BSD extension |
| and is also an SVID extension. |
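| |
| The following sketch shows several of these functions used together to |
| tally the characters of a string. The names 'struct char_counts' and |
| 'count_classes' are invented for the example. Each 'char' is converted |
| to 'unsigned char' before being passed on, because a plain 'char' with |
| a negative value (other than 'EOF') is not a valid argument. |

```c
#include <ctype.h>
#include <stddef.h>

struct char_counts
{
  size_t letters;
  size_t digits;
  size_t spaces;
  size_t punct;
};

/* Count the characters of S by class, using the current locale. */
struct char_counts
count_classes (const char *s)
{
  struct char_counts n = { 0, 0, 0, 0 };
  for (; *s != '\0'; s++)
    {
      int c = (unsigned char) *s;   /* never pass a negative char */
      if (isalpha (c))
        n.letters++;
      else if (isdigit (c))
        n.digits++;
      else if (isspace (c))
        n.spaces++;
      else if (ispunct (c))
        n.punct++;
    }
  return n;
}
```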
| |
| |
| File: libc.info, Node: Case Conversion, Next: Classification of Wide Characters, Prev: Classification of Characters, Up: Character Handling |
| |
| 4.2 Case Conversion |
| =================== |
| |
| This section explains the library functions for performing conversions |
| such as case mappings on characters. For example, 'toupper' converts |
| any character to upper case if possible. If the character can't be |
| converted, 'toupper' returns it unchanged. |
| |
| These functions take one argument of type 'int', which is the |
| character to convert, and return the converted character as an 'int'. |
| If the conversion is not applicable to the argument given, the argument |
| is returned unchanged. |
| |
| *Compatibility Note:* In pre-ISO C dialects, instead of returning the |
| argument unchanged, these functions may fail when the argument is not |
| suitable for the conversion. Thus for portability, you may need to |
| write 'islower(c) ? toupper(c) : c' rather than just 'toupper(c)'. |
| |
| These functions are declared in the header file 'ctype.h'. |
| |
| -- Function: int tolower (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| If C is an upper-case letter, 'tolower' returns the corresponding |
| lower-case letter. If C is not an upper-case letter, C is returned |
| unchanged. |
| |
| -- Function: int toupper (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| If C is a lower-case letter, 'toupper' returns the corresponding |
| upper-case letter. Otherwise C is returned unchanged. |
| |
| -- Function: int toascii (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function converts C to a 7-bit 'unsigned char' value that fits |
| into the US/UK ASCII character set, by clearing the high-order |
| bits. This function is a BSD extension and is also an SVID |
| extension. |
| |
| -- Function: int _tolower (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is identical to 'tolower', and is provided for compatibility |
| with the SVID. *Note SVID::. |
| |
| -- Function: int _toupper (int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is identical to 'toupper', and is provided for compatibility |
| with the SVID. |
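| |
| As a short sketch of these functions in use, the hypothetical helper |
| 'upcase_in_place' converts a whole string with 'toupper'. Characters |
| that have no upper-case form, such as digits and punctuation, come back |
| unchanged. |

```c
#include <ctype.h>

/* Convert every lower-case letter in S to upper case; return S. */
char *
upcase_in_place (char *s)
{
  for (char *p = s; *p != '\0'; p++)
    /* Convert to unsigned char first so the argument is never negative. */
    *p = (char) toupper ((unsigned char) *p);
  return s;
}
```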
| |
| |
| File: libc.info, Node: Classification of Wide Characters, Next: Using Wide Char Classes, Prev: Case Conversion, Up: Character Handling |
| |
| 4.3 Character class determination for wide characters |
| ===================================================== |
| |
| Amendment 1 to ISO C90 defines functions to classify wide characters. |
| Although the original ISO C90 standard already defined the type |
| 'wchar_t', no functions operating on it were defined. |
| |
| The design of the classification functions for wide characters is |
| more general. It allows extensions to the set of |
| available classifications, beyond those which are always available. The |
| POSIX standard specifies how extensions can be made, and this is already |
| implemented in the GNU C Library implementation of the 'localedef' |
| program. |
| |
| The character class functions are normally implemented with bitsets, |
| with a bitset per character. For a given character, the appropriate |
| bitset is read from a table and a test is performed as to whether a |
| certain bit is set. Which bit is tested for is determined by the class. |
| |
| For the wide character classification functions this is made visible. |
| There is a classification type defined, a function to retrieve a value |
| of this type for a given class, and a function to test whether a given |
| character is in this class, using the classification value. On top of |
| this the normal character classification functions as used for 'char' |
| objects can be defined. |
| |
| -- Data type: wctype_t |
| The 'wctype_t' type can hold a value which represents a character |
| class. The only defined way to generate such a value is by using |
| the 'wctype' function. |
| |
| This type is defined in 'wctype.h'. |
| |
| -- Function: wctype_t wctype (const char *PROPERTY) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| The 'wctype' function returns a value representing a class of wide |
| characters which is identified by the string PROPERTY. Besides |
| some standard properties, each locale can define its own. If no |
| property with the given name is known for the current locale |
| selected for the 'LC_CTYPE' category, the function returns zero. |
| |
| The properties known in every locale are: |
| |
| '"alnum"' '"alpha"' '"cntrl"' '"digit"' |
| '"graph"' '"lower"' '"print"' '"punct"' |
| '"space"' '"upper"' '"xdigit"' |
| |
| This function is declared in 'wctype.h'. |
| |
| To test the membership of a character in one of the non-standard |
| classes the ISO C standard defines a completely new function. |
| |
| -- Function: int iswctype (wint_t WC, wctype_t DESC) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function returns a nonzero value if WC is in the character |
| class specified by DESC. DESC must previously be returned by a |
| successful call to 'wctype'. |
| |
| This function is declared in 'wctype.h'. |
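| |
| A minimal sketch of the two functions working together follows. The |
| wrapper 'wc_in_class' is a hypothetical name; it reports an unknown |
| property separately instead of passing a zero descriptor to |
| 'iswctype'. |

```c
#include <wchar.h>
#include <wctype.h>

/* Return 1 if WC belongs to the class named PROPERTY, 0 if it does not,
   and -1 if the current locale knows no class of that name. */
int
wc_in_class (wint_t wc, const char *property)
{
  wctype_t desc = wctype (property);
  if (desc == (wctype_t) 0)
    return -1;                  /* wctype did not recognize PROPERTY */
  return iswctype (wc, desc) != 0;
}
```

| A program that tests many characters against the same class would look |
| the descriptor up once with 'wctype' and reuse it. |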
| |
| To make it easier to use the commonly-used classification functions, |
| they are defined in the C library. There is no need to use 'wctype' if |
| the property string is one of the known character classes. In some |
| situations it is desirable to construct the property strings, and then |
| it is important that 'wctype' can also handle the standard classes. |
| |
| -- Function: int iswalnum (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function returns a nonzero value if WC is an alphanumeric |
| character (a letter or number); in other words, if either |
| 'iswalpha' or 'iswdigit' is true of a character, then 'iswalnum' is |
| also true. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("alnum")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswalpha (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is an alphabetic character (a letter). If |
| 'iswlower' or 'iswupper' is true of a character, then 'iswalpha' is |
| also true. |
| |
| In some locales, there may be additional characters for which |
| 'iswalpha' is true--letters which are neither upper case nor lower |
| case. But in the standard '"C"' locale, there are no such |
| additional characters. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("alpha")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswcntrl (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a control character (that is, a character |
| that is not a printing character). |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("cntrl")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswdigit (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a digit (e.g., '0' through '9'). Note that |
| this function returns a nonzero value not only for _decimal_ |
| digits, but for all kinds of digits. A consequence is that code |
| like the following will *not* work unconditionally for wide |
| characters: |
| |
| n = 0; |
| while (iswdigit (*wc)) |
| { |
| n *= 10; |
| n += *wc++ - L'0'; |
| } |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("digit")) |
| |
| It is declared in 'wctype.h'. |
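| |
| One way to sidestep the problem described above is to test for the |
| range 'L'0'' through 'L'9'' explicitly rather than calling 'iswdigit'. |
| The sketch below does this; 'parse_decimal' is a hypothetical name, the |
| wide codes of the ten decimal digits are assumed to be consecutive, and |
| overflow of the result is not checked. |

```c
#include <wchar.h>

/* Convert the leading decimal digits of WC to a number.
   Stops at the first character outside L'0'..L'9'. */
long
parse_decimal (const wchar_t *wc)
{
  long n = 0;
  while (*wc >= L'0' && *wc <= L'9')
    {
      n = n * 10 + (long) (*wc - L'0');
      wc++;
    }
  return n;
}
```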
| |
| -- Function: int iswgraph (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a graphic character; that is, a character |
| that has a glyph associated with it. The whitespace characters are |
| not considered graphic. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("graph")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswlower (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a lower-case letter. The letter need not be |
| from the Latin alphabet; any representable alphabet is valid. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("lower")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswprint (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a printing character. Printing characters |
| include all the graphic characters, plus the space (' ') character. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("print")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswpunct (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a punctuation character. This means any |
| printing character that is not alphanumeric or a space character. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("punct")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswspace (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a "whitespace" character. In the standard |
| '"C"' locale, 'iswspace' returns true for only the standard |
| whitespace characters: |
| |
| 'L' '' |
| space |
| |
| 'L'\f'' |
| formfeed |
| |
| 'L'\n'' |
| newline |
| |
| 'L'\r'' |
| carriage return |
| |
| 'L'\t'' |
| horizontal tab |
| |
| 'L'\v'' |
| vertical tab |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("space")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswupper (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is an upper-case letter. The letter need not be |
| from the Latin alphabet; any representable alphabet is valid. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("upper")) |
| |
| It is declared in 'wctype.h'. |
| |
| -- Function: int iswxdigit (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a hexadecimal digit. Hexadecimal digits |
| include the normal decimal digits '0' through '9' and the letters |
| 'A' through 'F' and 'a' through 'f'. |
| |
| This function can be implemented using |
| |
| iswctype (wc, wctype ("xdigit")) |
| |
| It is declared in 'wctype.h'. |
| |
| The GNU C Library also provides a function which was not part of the |
| original ISO C wide character interface but which has a counterpart |
| for single-byte characters ('isblank') as well. |
| |
| -- Function: int iswblank (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| Returns true if WC is a blank character; that is, a space or a tab. |
| This function was originally a GNU extension, but was added in |
| ISO C99. It is declared in 'wctype.h'. |
| |
| |
| File: libc.info, Node: Using Wide Char Classes, Next: Wide Character Case Conversion, Prev: Classification of Wide Characters, Up: Character Handling |
| |
| 4.4 Notes on using the wide character classes |
| ============================================= |
| |
| The first note is probably not astonishing but is still occasionally a |
| cause of problems. The 'iswXXX' functions can be implemented as |
| macros, and in fact the GNU C Library does this. They are still |
| available as real functions, but when the 'wctype.h' header is included |
| the macros will be used. The same is true of the 'char' type versions |
| of these functions. |
| |
| The second note covers something new. It can be best illustrated by |
| a (real-world) example. The first piece of code is an excerpt from the |
| original code. It is truncated a bit but the intention should be clear. |
| |
| int |
| is_in_class (int c, const char *class) |
| { |
| if (strcmp (class, "alnum") == 0) |
| return isalnum (c); |
| if (strcmp (class, "alpha") == 0) |
| return isalpha (c); |
| if (strcmp (class, "cntrl") == 0) |
| return iscntrl (c); |
| ... |
| return 0; |
| } |
| |
| Now, with the 'wctype' and 'iswctype' functions you can avoid the 'if' |
| cascades, but rewriting the code as follows is wrong: |
| |
| int |
| is_in_class (int c, const char *class) |
| { |
| wctype_t desc = wctype (class); |
| return desc ? iswctype ((wint_t) c, desc) : 0; |
| } |
| |
| The problem is that it is not guaranteed that the wide character |
| representation of a single-byte character can be found using casting. |
| In fact, usually this fails miserably. The correct solution to this |
| problem is to write the code as follows: |
| |
| int |
| is_in_class (int c, const char *class) |
| { |
| wctype_t desc = wctype (class); |
| return desc ? iswctype (btowc (c), desc) : 0; |
| } |
| |
| *Note Converting a Character::, for more information on 'btowc'. |
| Note that this change probably does not improve the performance of the |
| program a lot since the 'wctype' function still has to make the string |
| comparisons. It gets really interesting if the 'is_in_class' function |
| is called more than once for the same class name. In this case the |
| variable DESC could be computed once and reused for all the calls. |
| Therefore the above form of the function is probably not the final one. |
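| |
| One possible shape for such a final form is sketched below. It looks |
| the class up once and reuses the descriptor for every byte of a |
| string. The name 'count_in_class' is hypothetical, and the sketch is |
| only meaningful for single-byte characters, since 'btowc' converts |
| one byte at a time. |
| |
```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the bytes of S that belong to the
   character class CLASS_NAME in the current locale.  */
size_t
count_in_class (const char *s, const char *class_name)
{
  /* Look the class up once, outside the loop.  */
  wctype_t desc = wctype (class_name);
  size_t count = 0;

  if (desc == (wctype_t) 0)
    return 0;                   /* Unknown class in this locale.  */

  for (; *s != '\0'; ++s)
    {
      /* Convert with btowc rather than casting to wint_t.  */
      wint_t wc = btowc ((unsigned char) *s);
      if (wc != WEOF && iswctype (wc, desc))
        ++count;
    }
  return count;
}
```
| |
| The string comparisons inside 'wctype' now happen once per call |
| instead of once per character. |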
| |
| |
| File: libc.info, Node: Wide Character Case Conversion, Prev: Using Wide Char Classes, Up: Character Handling |
| |
| 4.5 Mapping of wide characters. |
| =============================== |
| |
| Like the classification functions, the character mapping functions are |
| generalized by the ISO C standard. Instead of just allowing the two |
| standard mappings, a locale can contain others. Again, the 'localedef' |
| program already supports generating such locale data files. |
| |
| -- Data Type: wctrans_t |
| This data type is defined as a scalar type which can hold a value |
| representing the locale-dependent character mapping. There is no |
| way to construct such a value apart from using the return value of |
| the 'wctrans' function. |
| |
| This type is defined in 'wctype.h'. |
| |
| -- Function: wctrans_t wctrans (const char *PROPERTY) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| The 'wctrans' function has to be used to find out whether a named |
| mapping is defined in the current locale selected for the |
| 'LC_CTYPE' category. If the returned value is non-zero, you can |
| use it afterwards in calls to 'towctrans'. If the return value is |
| zero no such mapping is known in the current locale. |
| |
| Besides locale-specific mappings there are two mappings which are |
| guaranteed to be available in every locale: |
| |
| '"tolower"' '"toupper"' |
| |
| This function is declared in 'wctype.h'. |
| |
| -- Function: wint_t towctrans (wint_t WC, wctrans_t DESC) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'towctrans' maps the input character WC according to the rules of |
| the mapping for which DESC is a descriptor, and returns the value |
| it finds. DESC must be obtained by a successful call to 'wctrans'. |
| |
| This function is declared in 'wctype.h'. |
| |
| For the generally available mappings, the ISO C standard defines |
| convenient shortcuts so that it is not necessary to call 'wctrans' for |
| them. |
| |
| -- Function: wint_t towlower (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| If WC is an upper-case letter, 'towlower' returns the corresponding |
| lower-case letter. If WC is not an upper-case letter, WC is |
| returned unchanged. |
| |
| 'towlower' can be implemented using |
| |
| towctrans (wc, wctrans ("tolower")) |
| |
| This function is declared in 'wctype.h'. |
| |
| -- Function: wint_t towupper (wint_t WC) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| If WC is a lower-case letter, 'towupper' returns the corresponding |
| upper-case letter. Otherwise WC is returned unchanged. |
| |
| 'towupper' can be implemented using |
| |
| towctrans (wc, wctrans ("toupper")) |
| |
| This function is declared in 'wctype.h'. |
| |
| The same warnings given in the last section for the use of the wide |
| character classification functions apply here. It is not possible to |
| simply cast a 'char' type value to a 'wint_t' and use it as an argument |
| to 'towctrans' calls. |
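| |
| The following sketch shows one way the 'wctrans' and 'towctrans' pair |
| might be used to apply a named mapping to a whole wide character |
| string. The name 'map_wide_string' and its return convention are |
| hypothetical. The descriptor is looked up once and then reused for |
| every element. |
| |
```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: apply the mapping named MAPPING to every
   wide character of WS in place.  Returns 0 on success, or -1 if
   the current locale does not know the mapping.  */
int
map_wide_string (wchar_t *ws, const char *mapping)
{
  wctrans_t desc = wctrans (mapping);

  if (desc == (wctrans_t) 0)
    return -1;

  for (; *ws != L'\0'; ++ws)
    *ws = (wchar_t) towctrans ((wint_t) *ws, desc);
  return 0;
}
```
| |
| Because the elements are already wide characters, converting each one |
| to 'wint_t' is safe here; the warning above concerns 'char' values. |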
| |
| |
| File: libc.info, Node: String and Array Utilities, Next: Character Set Handling, Prev: Character Handling, Up: Top |
| |
| 5 String and Array Utilities |
| **************************** |
| |
| Operations on strings (or arrays of characters) are an important part of |
| many programs. The GNU C Library provides an extensive set of string |
| utility functions, including functions for copying, concatenating, |
| comparing, and searching strings. Many of these functions can also |
| operate on arbitrary regions of storage; for example, the 'memcpy' |
| function can be used to copy the contents of any kind of array. |
| |
| It's fairly common for beginning C programmers to "reinvent the |
| wheel" by duplicating this functionality in their own code, but it pays |
| to become familiar with the library functions and to make use of them, |
| since this offers benefits in maintenance, efficiency, and portability. |
| |
| For instance, you could easily compare one string to another in two |
| lines of C code, but if you use the built-in 'strcmp' function, you're |
| less likely to make a mistake. And, since these library functions are |
| typically highly optimized, your program may run faster too. |
| |
| * Menu: |
| |
| * Representation of Strings:: Introduction to basic concepts. |
| * String/Array Conventions:: Whether to use a string function or an |
| arbitrary array function. |
| * String Length:: Determining the length of a string. |
| * Copying and Concatenation:: Functions to copy the contents of strings |
| and arrays. |
| * String/Array Comparison:: Functions for byte-wise and character-wise |
| comparison. |
| * Collation Functions:: Functions for collating strings. |
| * Search Functions:: Searching for a specific element or substring. |
| * Finding Tokens in a String:: Splitting a string into tokens by looking |
| for delimiters. |
| * strfry:: Function for flash-cooking a string. |
| * Trivial Encryption:: Obscuring data. |
| * Encode Binary Data:: Encoding and Decoding of Binary Data. |
| * Argz and Envz Vectors:: Null-separated string vectors. |
| |
| |
| File: libc.info, Node: Representation of Strings, Next: String/Array Conventions, Up: String and Array Utilities |
| |
| 5.1 Representation of Strings |
| ============================= |
| |
| This section is a quick summary of string concepts for beginning C |
| programmers. It describes how character strings are represented in C |
| and some common pitfalls. If you are already familiar with this |
| material, you can skip this section. |
| |
| A "string" is an array of 'char' objects. But string-valued |
| variables are usually declared to be pointers of type 'char *'. Such |
| variables do not include space for the text of a string; that has to be |
| stored somewhere else--in an array variable, a string constant, or |
| dynamically allocated memory (*note Memory Allocation::). It's up to |
| you to store the address of the chosen memory space into the pointer |
| variable. Alternatively you can store a "null pointer" in the pointer |
| variable. The null pointer does not point anywhere, so attempting to |
| reference the string it points to gets an error. |
| |
| "string" normally refers to multibyte character strings as opposed to |
| wide character strings. Wide character strings are arrays of type |
| 'wchar_t' and, as with multibyte character strings, pointers of type |
| 'wchar_t *' are usually used to refer to them. |
| |
| By convention, a "null character", ''\0'', marks the end of a |
| multibyte character string and the "null wide character", 'L'\0'', marks |
| the end of a wide character string. For example, in testing to see |
| whether the 'char *' variable P points to a null character marking the |
| end of a string, you can write '!*P' or '*P == '\0''. |
| |
| A null character is quite different conceptually from a null pointer, |
| although both are represented by the integer '0'. |
| |
| "String literals" appear in C program source as strings of characters |
| between double-quote characters ('"'), as in '"foo"'. In a wide |
| character string literal, the initial double-quote character is |
| immediately preceded by a capital 'L' (ell) character (as in |
| 'L"foo"'). In ISO C, string literals can also be formed by "string |
| concatenation": '"a" "b"' is the same as '"ab"'. For wide character |
| strings one can either use 'L"a" L"b"' or 'L"a" "b"'. Modification of |
| string literals is not allowed by the GNU C compiler, because literals |
| are placed in read-only storage. |
| |
| Character arrays that are declared 'const' cannot be modified either. |
| It's generally good style to declare non-modifiable string pointers to |
| be of type 'const char *', since this often allows the C compiler to |
| detect accidental modifications as well as providing some amount of |
| documentation about what your program intends to do with the string. |
| |
| The amount of memory allocated for the character array may extend |
| past the null character that normally marks the end of the string. In |
| this document, the term "allocated size" is always used to refer to the |
| total amount of memory allocated for the string, while the term "length" |
| refers to the number of characters up to (but not including) the |
| terminating null character. |
| |
| A notorious source of program bugs is trying to put more characters |
| in a string than fit in its allocated size. When writing code that |
| extends strings or moves characters into a pre-allocated array, you |
| should be very careful to keep track of the length of the text and make |
| explicit checks for overflowing the array. Many of the library |
| functions _do not_ do this for you! Remember also that you need to |
| allocate an extra byte to hold the null character that marks the end of |
| the string. |
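| |
| A minimal sketch of such an explicit check follows. The function |
| name 'copy_checked' and its return convention are hypothetical. It |
| compares the length of the text, plus one byte for the terminating |
| null character, against the allocated size before copying anything. |
| |
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the string FROM into the array TO,
   whose allocated size is TOSIZE bytes.  Returns 0 on success, or
   -1 without writing anything if FROM and its terminating null
   character would not fit.  */
int
copy_checked (char *to, size_t tosize, const char *from)
{
  size_t len = strlen (from);

  /* LEN bytes of text plus one null byte must fit in TOSIZE.  */
  if (len >= tosize)
    return -1;

  memcpy (to, from, len + 1);
  return 0;
}
```
| |
| A caller would typically pass 'sizeof' of the destination array as |
| TOSIZE, which only works when the destination is an array and not a |
| pointer to it. |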
| |
| Originally strings were sequences of bytes where each byte represents |
| a single character. This is still true today if the strings are encoded |
| using a single-byte character encoding. Things are different if the |
| strings are encoded using a multibyte encoding (for more information on |
| encodings see *note Extended Char Intro::). There is no difference in |
| the programming interface for these two kinds of strings; the programmer |
| has to be aware of this and interpret the byte sequences accordingly. |
| |
| But since there is no separate interface taking care of these |
| differences the byte-based string functions are sometimes hard to use. |
| Since the count parameters of these functions specify bytes a call to |
| 'strncpy' could cut a multibyte character in the middle and put an |
| incomplete (and therefore unusable) byte sequence in the target buffer. |
| |
| To avoid these problems, later versions of the ISO C standard |
| introduce a second set of functions which operate on "wide |
| characters" (*note Extended Char Intro::). These functions don't have |
| the problems the single-byte versions have since every wide character is |
| a legal, interpretable value. This does not mean that cutting wide |
| character strings at arbitrary points is without problems. It normally |
| is for alphabet-based languages (except for non-normalized text) but |
| languages based on syllables still have the problem that more than one |
| wide character is necessary to complete a logical unit. This is a |
| higher level problem which the C library functions are not designed to |
| solve. But it is at least good that no invalid byte sequences can be |
| created. Also, higher level functions can operate much more easily on |
| wide characters than on multibyte characters, so the general advice is |
| to use wide characters internally whenever text is more than simply |
| copied. |
| |
| The remainder of this chapter will discuss the functions for handling |
| wide character strings in parallel with the discussion of the multibyte |
| character strings since there is almost always an exact equivalent |
| available. |
| |
| |
| File: libc.info, Node: String/Array Conventions, Next: String Length, Prev: Representation of Strings, Up: String and Array Utilities |
| |
| 5.2 String and Array Conventions |
| ================================ |
| |
| This chapter describes both functions that work on arbitrary arrays or |
| blocks of memory, and functions that are specific to null-terminated |
| arrays of characters and wide characters. |
| |
| Functions that operate on arbitrary blocks of memory have names |
| beginning with 'mem' and 'wmem' (such as 'memcpy' and 'wmemcpy') and |
| invariably take an argument which specifies the size (in bytes and wide |
| characters respectively) of the block of memory to operate on. The |
| array arguments and return values for these functions have type 'void *' |
| or 'wchar_t *'. As a matter of style, the elements of the arrays used |
| with the 'mem' functions are referred to as "bytes". You can pass any |
| kind of pointer to these functions, and the 'sizeof' operator is useful |
| in computing the value for the size argument. Parameters to the 'wmem' |
| functions must be of type 'wchar_t *'. These functions are not really |
| usable with anything but arrays of this type. |
| |
| In contrast, functions that operate specifically on strings and wide |
| character strings have names beginning with 'str' and 'wcs' respectively |
| (such as 'strcpy' and 'wcscpy') and look for a null character to |
| terminate the string instead of requiring an explicit size argument to |
| be passed. (Some of these functions accept a specified maximum length, |
| but they also check for premature termination with a null character.) |
| The array arguments and return values for these functions have type |
| 'char *' and 'wchar_t *' respectively, and the array elements are |
| referred to as "characters" and "wide characters". |
| |
| In many cases, there are both 'mem' and 'str'/'wcs' versions of a |
| function. The one that is more appropriate to use depends on the exact |
| situation. When your program is manipulating arbitrary arrays or blocks |
| of storage, then you should always use the 'mem' functions. On the |
| other hand, when you are manipulating null-terminated strings it is |
| usually more convenient to use the 'str'/'wcs' functions, unless you |
| already know the length of the string in advance. The 'wmem' functions |
| should be used for wide character arrays with known size. |
| |
| Some of the memory and string functions take single characters as |
| arguments. Since a value of type 'char' is automatically promoted into |
| a value of type 'int' when used as a parameter, the functions are |
| declared with 'int' as the type of the parameter in question. In the |
| case of the wide character functions the situation is similar: the |
| parameter type for a single wide character is 'wint_t' and not |
| 'wchar_t'. This would not be necessary for many implementations since |
| 'wchar_t' is large enough not to be automatically promoted, but since |
| the ISO C standard does not require such a choice of types the 'wint_t' |
| type is used. |
| |
| |
| File: libc.info, Node: String Length, Next: Copying and Concatenation, Prev: String/Array Conventions, Up: String and Array Utilities |
| |
| 5.3 String Length |
| ================= |
| |
| You can get the length of a string using the 'strlen' function. This |
| function is declared in the header file 'string.h'. |
| |
| -- Function: size_t strlen (const char *S) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strlen' function returns the length of the null-terminated |
| string S in bytes. (In other words, it returns the offset of the |
| terminating null character within the array.) |
| |
| For example, |
| strlen ("hello, world") |
| => 12 |
| |
| When applied to a character array, the 'strlen' function returns |
| the length of the string stored there, not its allocated size. You |
| can get the allocated size of the character array that holds a |
| string using the 'sizeof' operator: |
| |
| char string[32] = "hello, world"; |
| sizeof (string) |
| => 32 |
| strlen (string) |
| => 12 |
| |
| But beware, this will not work unless STRING is the character array |
| itself, not a pointer to it. For example: |
| |
| char string[32] = "hello, world"; |
| char *ptr = string; |
| sizeof (string) |
| => 32 |
| sizeof (ptr) |
| => 4 /* (on a machine with 4 byte pointers) */ |
| |
| This is an easy mistake to make when you are working with functions |
| that take string arguments; those arguments are always pointers, |
| not arrays. |
| |
| It must also be noted that for multibyte encoded strings the return |
| value does not have to correspond to the number of characters in |
| the string. To get this value the string can be converted to wide |
| characters and 'wcslen' can be used or something like the following |
| code can be used: |
| |
| /* The input is in 'string'. |
| The length is expected in 'n'. */ |
| { |
| mbstate_t t; |
| const char *scopy = string; |
| /* In initial state. */ |
| memset (&t, '\0', sizeof (t)); |
| /* Determine number of characters; the result is (size_t) -1 |
| if STRING contains an invalid multibyte sequence. */ |
| n = mbsrtowcs (NULL, &scopy, strlen (scopy), &t); |
| } |
| |
| This is cumbersome, so if the number of characters (as opposed to |
| bytes) is needed often, it is better to work with wide |
| characters. |
| |
| The wide character equivalent is declared in 'wchar.h'. |
| |
| -- Function: size_t wcslen (const wchar_t *WS) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcslen' function is the wide character equivalent to 'strlen'. |
| The return value is the number of wide characters in the wide |
| character string pointed to by WS (this is also the offset of the |
| terminating null wide character of WS). |
| |
| Since there are no multi wide character sequences making up one |
| character, the return value is not only the offset in the array but |
| also the number of wide characters. |
| |
| This function was introduced in Amendment 1 to ISO C90. |
| |
| -- Function: size_t strnlen (const char *S, size_t MAXLEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strnlen' function returns the length of the string S in bytes |
| if this length is smaller than MAXLEN bytes. Otherwise it returns |
| MAXLEN. Therefore this function is equivalent to '(strlen (S) < |
| MAXLEN ? strlen (S) : MAXLEN)' but it is more efficient and works |
| even if the string S is not null-terminated. |
| |
| char string[32] = "hello, world"; |
| strnlen (string, 32) |
| => 12 |
| strnlen (string, 5) |
| => 5 |
| |
| This function is a GNU extension and is declared in 'string.h'. |
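| |
| A typical use is measuring text held in a fixed-size field that may |
| fill the whole field and so have no terminating null character. The |
| sketch below assumes a hypothetical 'struct record' with an 8-byte |
| 'name' member; the '_GNU_SOURCE' definition is an assumption about |
| how the declaration is made visible in strict ISO C mode. |
| |
```c
#define _GNU_SOURCE 1           /* Assumed needed for strnlen.  */
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout: NAME holds up to 8 bytes of text and
   is null-terminated only when the text is shorter than that.  */
struct record
{
  char name[8];
};

/* Return the length of the text in R->name without ever reading
   past the end of the field.  */
size_t
record_name_length (const struct record *r)
{
  return strnlen (r->name, sizeof r->name);
}
```
| |
| Calling 'strlen' on such a field could read past its end when all |
| eight bytes are in use. |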
| |
| -- Function: size_t wcsnlen (const wchar_t *WS, size_t MAXLEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'wcsnlen' is the wide character equivalent to 'strnlen'. The |
| MAXLEN parameter specifies the maximum number of wide characters. |
| |
| This function is a GNU extension and is declared in 'wchar.h'. |
| |
| |
| File: libc.info, Node: Copying and Concatenation, Next: String/Array Comparison, Prev: String Length, Up: String and Array Utilities |
| |
| 5.4 Copying and Concatenation |
| ============================= |
| |
| You can use the functions described in this section to copy the contents |
| of strings and arrays, or to append the contents of one string to |
| another. The 'str' and 'mem' functions are declared in the header file |
| 'string.h' while the 'wcs' and 'wmem' functions are declared in the |
| file 'wchar.h'. |
| |
| A helpful way to remember the ordering of the arguments to the |
| functions in this section is that it corresponds to an assignment |
| expression, with the destination array specified to the left of the |
| source array. All of these functions return the address of the |
| destination array. |
| |
| Most of these functions do not work properly if the source and |
| destination arrays overlap. For example, if the beginning of the |
| destination array overlaps the end of the source array, the original |
| contents of that part of the source array may get overwritten before it |
| is copied. Even worse, in the case of the string functions, the null |
| character marking the end of the string may be lost, and the copy |
| function might get stuck in a loop trashing all the memory allocated to |
| your program. |
| |
| All functions that have problems copying between overlapping arrays |
| are explicitly identified in this manual. In addition to functions in |
| this section, there are a few others like 'sprintf' (*note Formatted |
| Output Functions::) and 'scanf' (*note Formatted Input Functions::). |
| |
| -- Function: void * memcpy (void *restrict TO, const void *restrict |
| FROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'memcpy' function copies SIZE bytes from the object beginning |
| at FROM into the object beginning at TO. The behavior of this |
| function is undefined if the two arrays TO and FROM overlap; use |
| 'memmove' instead if overlapping is possible. |
| |
| The value returned by 'memcpy' is the value of TO. |
| |
| Here is an example of how you might use 'memcpy' to copy the |
| contents of an array: |
| |
| struct foo *oldarray, *newarray; |
| int arraysize; |
| ... |
| memcpy (newarray, oldarray, arraysize * sizeof (struct foo)); |
| |
| -- Function: wchar_t * wmemcpy (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wmemcpy' function copies SIZE wide characters from the object |
| beginning at WFROM into the object beginning at WTO. The behavior |
| of this function is undefined if the two arrays WTO and WFROM |
| overlap; use 'wmemmove' instead if overlapping is possible. |
| |
| The following is a possible implementation of 'wmemcpy' but there |
| are more optimizations possible. |
| |
| wchar_t * |
| wmemcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom, |
| size_t size) |
| { |
| return (wchar_t *) memcpy (wto, wfrom, size * sizeof (wchar_t)); |
| } |
| |
| The value returned by 'wmemcpy' is the value of WTO. |
| |
| This function was introduced in Amendment 1 to ISO C90. |
| |
| -- Function: void * mempcpy (void *restrict TO, const void *restrict |
| FROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'mempcpy' function is nearly identical to the 'memcpy' |
| function. It copies SIZE bytes from the object beginning at FROM |
| into the object pointed to by TO. But instead of returning the |
| value of TO it returns a pointer to the byte following the last |
| written byte in the object beginning at TO. I.e., the value is |
| '((void *) ((char *) TO + SIZE))'. |
| |
| This function is useful in situations where a number of objects |
| shall be copied to consecutive memory positions. |
| |
| void * |
| combine (void *o1, size_t s1, void *o2, size_t s2) |
| { |
| void *result = malloc (s1 + s2); |
| if (result != NULL) |
| mempcpy (mempcpy (result, o1, s1), o2, s2); |
| return result; |
| } |
| |
| This function is a GNU extension. |
| |
| -- Function: wchar_t * wmempcpy (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wmempcpy' function is nearly identical to the 'wmemcpy' |
| function. It copies SIZE wide characters from the object beginning |
| at WFROM into the object pointed to by WTO. But instead of |
| returning the value of WTO it returns a pointer to the wide |
| character following the last written wide character in the object |
| beginning at WTO. I.e., the value is 'WTO + SIZE'. |
| |
| This function is useful in situations where a number of objects |
| shall be copied to consecutive memory positions. |
| |
| The following is a possible implementation of 'wmempcpy' but there |
| are more optimizations possible. |
| |
| wchar_t * |
| wmempcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom, |
| size_t size) |
| { |
| return (wchar_t *) mempcpy (wto, wfrom, size * sizeof (wchar_t)); |
| } |
| |
| This function is a GNU extension. |
| |
| -- Function: void * memmove (void *TO, const void *FROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'memmove' copies the SIZE bytes at FROM into the SIZE bytes at TO, |
| even if those two blocks of space overlap. In the case of overlap, |
| 'memmove' is careful to copy the original values of the bytes in |
| the block at FROM, including those bytes which also belong to the |
| block at TO. |
| |
| The value returned by 'memmove' is the value of TO. |
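| |
| Overlap arises naturally when elements are shifted within a single |
| array. The following sketch, with the hypothetical name 'remove_at', |
| deletes one element by moving the tail of the array down one slot; |
| the source and destination blocks overlap, so 'memmove' is used |
| rather than 'memcpy'. |
| |
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove the element at INDEX from the array
   A holding COUNT ints by shifting the following elements down.
   Returns the new element count; an out-of-range INDEX leaves the
   array unchanged.  */
size_t
remove_at (int *a, size_t count, size_t index)
{
  if (index >= count)
    return count;

  /* The blocks starting at a[index + 1] and a[index] overlap.  */
  memmove (&a[index], &a[index + 1],
           (count - index - 1) * sizeof a[0]);
  return count - 1;
}
```
| |
| Removing the last element moves zero bytes, which is harmless. |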
| |
| -- Function: wchar_t * wmemmove (wchar_t *WTO, const wchar_t *WFROM, |
| size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'wmemmove' copies the SIZE wide characters at WFROM into the SIZE |
| wide characters at WTO, even if those two blocks of space overlap. |
| In the case of overlap, 'wmemmove' is careful to copy the original |
| values of the wide characters in the block at WFROM, including |
| those wide characters which also belong to the block at WTO. |
| |
| The following is a possible implementation of 'wmemmove' but there |
| are more optimizations possible. |
| |
| wchar_t * |
| wmemmove (wchar_t *wto, const wchar_t *wfrom, size_t size) |
| { |
| return (wchar_t *) memmove (wto, wfrom, size * sizeof (wchar_t)); |
| } |
| |
| The value returned by 'wmemmove' is the value of WTO. |
| |
| This function was introduced in Amendment 1 to ISO C90. |
| |
| -- Function: void * memccpy (void *restrict TO, const void *restrict |
| FROM, int C, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function copies no more than SIZE bytes from FROM to TO, |
| stopping if a byte matching C is found. The return value is a |
| pointer into TO one byte past where C was copied, or a null pointer |
| if no byte matching C appeared in the first SIZE bytes of FROM. |
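| |
| As a sketch of how the return value can be used, the following |
| hypothetical function 'copy_key' extracts the text before the first |
| '=' of a line. The copy is limited both by the size of the |
| destination and by the length of the source, so 'memccpy' never |
| reads past the end of LINE. The '_GNU_SOURCE' definition is an |
| assumption about how the declaration is made visible. |
| |
```c
#define _GNU_SOURCE 1           /* Assumed needed for memccpy.  */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the part of LINE before its first '='
   into KEY (KEYSIZE bytes) and null-terminate it.  Returns the key
   length, or (size_t) -1 if no '=' is found within the limit; the
   contents of KEY are then unspecified.  */
size_t
copy_key (char *key, size_t keysize, const char *line)
{
  size_t limit = strlen (line);
  char *end;

  if (limit > keysize)
    limit = keysize;

  end = memccpy (key, line, '=', limit);
  if (end == NULL)
    return (size_t) -1;

  /* END points one byte past the copied '='; overwrite the '='.  */
  end[-1] = '\0';
  return (size_t) (end - 1 - key);
}
```
| |
| The byte holding the copied '=' is reused for the terminating null |
| character, so a key of up to KEYSIZE - 1 bytes fits. |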
| |
| -- Function: void * memset (void *BLOCK, int C, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function copies the value of C (converted to an 'unsigned |
| char') into each of the first SIZE bytes of the object beginning at |
| BLOCK. It returns the value of BLOCK. |
| |
| -- Function: wchar_t * wmemset (wchar_t *BLOCK, wchar_t WC, size_t |
| SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function copies the value of WC into each of the first SIZE |
| wide characters of the object beginning at BLOCK. It returns the |
| value of BLOCK. |
| |
| -- Function: char * strcpy (char *restrict TO, const char *restrict |
| FROM) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This copies characters from the string FROM (up to and including |
| the terminating null character) into the string TO. Like 'memcpy', |
| this function has undefined results if the strings overlap. The |
| return value is the value of TO. |
| |
| -- Function: wchar_t * wcscpy (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This copies wide characters from the string WFROM (up to and |
| including the terminating null wide character) into the string WTO. |
| Like 'wmemcpy', this function has undefined results if the strings |
| overlap. The return value is the value of WTO. |
| |
| -- Function: char * strncpy (char *restrict TO, const char *restrict |
| FROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is similar to 'strcpy' but always copies exactly SIZE |
| characters into TO. |
| |
| If the length of FROM is more than SIZE, then 'strncpy' copies just |
| the first SIZE characters. Note that in this case there is no null |
| terminator written into TO. |
| |
| If the length of FROM is less than SIZE, then 'strncpy' copies all |
| of FROM, followed by enough null characters to add up to SIZE |
| characters in all. This behavior is rarely useful, but it is |
| specified by the ISO C standard. |
| |
| The behavior of 'strncpy' is undefined if the strings overlap. |
| |
| Using 'strncpy' as opposed to 'strcpy' is a way to avoid bugs |
| relating to writing past the end of the allocated space for TO. |
| However, it can also make your program much slower in one common |
| case: copying a string which is probably small into a potentially |
| large buffer. In this case, SIZE may be large, and when it is, |
| 'strncpy' will waste a considerable amount of time copying null |
| characters. |
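| |
| Since 'strncpy' writes no null terminator when FROM holds SIZE or |
| more characters, code that needs a proper string has to supply the |
| terminator itself. One way to do so is sketched below; the helper |
| name 'copy_truncated' is hypothetical, and the sketch assumes that |
| silently truncating the copy is acceptable to the caller. |

```c
#include <string.h>

/* Hypothetical helper: copy FROM into the SIZE-byte array TO,
   truncating if necessary, and always leave TO null-terminated.
   SIZE must be at least 1.  Returns TO.  */
char *
copy_truncated (char *to, size_t size, const char *from)
{
  /* Copy at most SIZE - 1 characters; 'strncpy' pads any unused
     part of that range with null characters.  */
  strncpy (to, from, size - 1);
  /* Supply the terminator that 'strncpy' omits when FROM is long.  */
  to[size - 1] = '\0';
  return to;
}
```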
| |
| -- Function: wchar_t * wcsncpy (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is similar to 'wcscpy' but always copies exactly SIZE |
| wide characters into WTO. |
| |
| If the length of WFROM is more than SIZE, then 'wcsncpy' copies |
| just the first SIZE wide characters. Note that in this case there |
| is no null terminator written into WTO. |
| |
| If the length of WFROM is less than SIZE, then 'wcsncpy' copies all |
| of WFROM, followed by enough null wide characters to add up to SIZE |
| wide characters in all. This behavior is rarely useful, but it is |
| specified by the ISO C standard. |
| |
| The behavior of 'wcsncpy' is undefined if the strings overlap. |
| |
| Using 'wcsncpy' as opposed to 'wcscpy' is a way to avoid bugs |
| relating to writing past the end of the allocated space for WTO. |
| However, it can also make your program much slower in one common |
| case: copying a string which is probably small into a potentially |
| large buffer. In this case, SIZE may be large, and when it is, |
| 'wcsncpy' will waste a considerable amount of time copying null |
| wide characters. |
| |
| -- Function: char * strdup (const char *S) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| This function copies the null-terminated string S into a newly |
| allocated string. The string is allocated using 'malloc'; see |
| *note Unconstrained Allocation::. If 'malloc' cannot allocate |
| space for the new string, 'strdup' returns a null pointer. |
| Otherwise it returns a pointer to the new string. |
| |
| -- Function: wchar_t * wcsdup (const wchar_t *WS) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| This function copies the null-terminated wide character string WS |
| into a newly allocated string. The string is allocated using |
| 'malloc'; see *note Unconstrained Allocation::. If 'malloc' cannot |
| allocate space for the new string, 'wcsdup' returns a null pointer. |
| Otherwise it returns a pointer to the new wide character string. |
| |
| This function is a GNU extension. |
| |
| -- Function: char * strndup (const char *S, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| This function is similar to 'strdup' but always copies at most SIZE |
| characters into the newly allocated string. |
| |
| If the length of S is more than SIZE, then 'strndup' copies just |
| the first SIZE characters and adds a closing null terminator. |
| Otherwise all characters are copied and the string is terminated. |
| |
| This function differs from 'strncpy' in that it always |
| terminates the destination string. |
| |
| 'strndup' is a GNU extension. |
| |
| -- Function: char * stpcpy (char *restrict TO, const char *restrict |
| FROM) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is like 'strcpy', except that it returns a pointer to |
| the end of the string TO (that is, the address of the terminating |
| null character 'to + strlen (from)') rather than the beginning. |
| |
| For example, this program uses 'stpcpy' to concatenate 'foo' and |
| 'bar' to produce 'foobar', which it then prints. |
| |
| |
| #include <string.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| { |
| char buffer[10]; |
| char *to = buffer; |
| to = stpcpy (to, "foo"); |
| to = stpcpy (to, "bar"); |
| puts (buffer); |
| return 0; |
| } |
| |
| This function is not part of the ISO or POSIX standards, and is not |
| customary on Unix systems, but we did not invent it either. |
| Perhaps it comes from MS-DOG. |
| |
| Its behavior is undefined if the strings overlap. The function is |
| declared in 'string.h'. |
| |
| -- Function: wchar_t * wcpcpy (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is like 'wcscpy', except that it returns a pointer to |
| the end of the string WTO (that is, the address of the terminating |
| null wide character 'wto + wcslen (wfrom)') rather than the beginning. |
| |
| This function is not part of ISO or POSIX but was found useful |
| while developing the GNU C Library itself. |
| |
| The behavior of 'wcpcpy' is undefined if the strings overlap. |
| |
| 'wcpcpy' is a GNU extension and is declared in 'wchar.h'. |
| |
| -- Function: char * stpncpy (char *restrict TO, const char *restrict |
| FROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is similar to 'stpcpy' but always copies exactly SIZE |
| characters into TO. |
| |
| If the length of FROM is more than SIZE, then 'stpncpy' copies just |
| the first SIZE characters and returns a pointer to the character |
| directly following the one which was copied last. Note that in |
| this case there is no null terminator written into TO. |
| |
| If the length of FROM is less than SIZE, then 'stpncpy' copies all |
| of FROM, followed by enough null characters to add up to SIZE |
| characters in all. This behavior is rarely useful, but it is |
| provided for contexts that rely on the same behavior of 'strncpy'. |
| 'stpncpy' returns a pointer to the _first_ written null character. |
| |
| This function is not part of ISO or POSIX but was found useful |
| while developing the GNU C Library itself. |
| |
| Its behavior is undefined if the strings overlap. The function is |
| declared in 'string.h'. |
| |
| -- Function: wchar_t * wcpncpy (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is similar to 'wcpcpy' but always copies exactly SIZE |
| wide characters into WTO. |
| |
| If the length of WFROM is more than SIZE, then 'wcpncpy' copies |
| just the first SIZE wide characters and returns a pointer to the |
| wide character directly following the one which was copied last. |
| Note that in this case there is no null terminator written into |
| WTO. |
| |
| If the length of WFROM is less than SIZE, then 'wcpncpy' copies all |
| of WFROM, followed by enough null wide characters to add up to SIZE |
| wide characters in all. This behavior is rarely useful, but it is |
| provided for contexts that rely on the same behavior of 'wcsncpy'. |
| 'wcpncpy' returns a pointer to the _first_ written null wide |
| character. |
| |
| This function is not part of ISO or POSIX but was found useful |
| while developing the GNU C Library itself. |
| |
| Its behavior is undefined if the strings overlap. |
| |
| 'wcpncpy' is a GNU extension and is declared in 'wchar.h'. |
| |
| -- Macro: char * strdupa (const char *S) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This macro is similar to 'strdup' but allocates the new string |
| using 'alloca' instead of 'malloc' (*note Variable Size |
| Automatic::). This means of course the returned string has the |
| same limitations as any block of memory allocated using 'alloca'. |
| |
| For obvious reasons 'strdupa' is implemented only as a macro; you |
| cannot get the address of this function. Despite this limitation |
| it is a useful function. The following code shows a situation |
| where using 'malloc' would be a lot more expensive. |
| |
| |
| #include <paths.h> |
| #include <string.h> |
| #include <stdio.h> |
| |
| const char path[] = _PATH_STDPATH; |
| |
| int |
| main (void) |
| { |
| char *wr_path = strdupa (path); |
| char *cp = strtok (wr_path, ":"); |
| |
| while (cp != NULL) |
| { |
| puts (cp); |
| cp = strtok (NULL, ":"); |
| } |
| return 0; |
| } |
| |
| Please note that calling 'strtok' using PATH directly is invalid |
| because 'strtok' modifies the string it scans and PATH is constant. |
| It is also not allowed to call 'strdupa' in the argument list of |
| 'strtok' since 'strdupa' uses 'alloca' (*note Variable Size |
| Automatic::), which can interfere with the parameter passing. |
| |
| This function is only available if GNU CC is used. |
| |
| -- Macro: char * strndupa (const char *S, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This macro is similar to 'strndup' but like 'strdupa' it |
| allocates the new string using 'alloca' (*note Variable Size |
| Automatic::). The same advantages and limitations of 'strdupa' are |
| valid for 'strndupa', too. |
| |
| This function is implemented only as a macro, just like 'strdupa'. |
| Just as 'strdupa' this macro also must not be used inside the |
| parameter list in a function call. |
| |
| 'strndupa' is only available if GNU CC is used. |
| |
| -- Function: char * strcat (char *restrict TO, const char *restrict |
| FROM) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strcat' function is similar to 'strcpy', except that the |
| characters from FROM are concatenated or appended to the end of TO, |
| instead of overwriting it. That is, the first character from FROM |
| overwrites the null character marking the end of TO. |
| |
| An equivalent definition for 'strcat' would be: |
| |
| char * |
| strcat (char *restrict to, const char *restrict from) |
| { |
| strcpy (to + strlen (to), from); |
| return to; |
| } |
| |
| This function has undefined results if the strings overlap. |
| |
| -- Function: wchar_t * wcscat (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcscat' function is similar to 'wcscpy', except that the wide |
| characters from WFROM are concatenated or appended to the end of |
| WTO, instead of overwriting it. That is, the first wide character |
| from WFROM overwrites the null wide character marking the end of WTO. |
| |
| An equivalent definition for 'wcscat' would be: |
| |
| wchar_t * |
| wcscat (wchar_t *wto, const wchar_t *wfrom) |
| { |
| wcscpy (wto + wcslen (wto), wfrom); |
| return wto; |
| } |
| |
| This function has undefined results if the strings overlap. |
| |
| Programmers using the 'strcat' or 'wcscat' function (or the following |
| 'strncat' or 'wcsncat' functions for that matter) can easily be |
| recognized as lazy and reckless. In almost all situations the lengths |
| of the participating strings are known (they had better be, since how |
| can one otherwise ensure the allocated size of the buffer is |
| sufficient?). Or at least, one could know them if one keeps track of the |
| results of the various function calls. But then it is very inefficient |
| to use 'strcat'/'wcscat'. A lot of time is wasted finding the end of |
| the destination string so that the actual copying can start. This is a |
| common example: |
| |
| /* This function concatenates arbitrarily many strings. The last |
| parameter must be 'NULL'. */ |
| char * |
| concat (const char *str, ...) |
| { |
| va_list ap, ap2; |
| size_t total = 1; |
| const char *s; |
| char *result; |
| |
| va_start (ap, str); |
| va_copy (ap2, ap); |
| |
| /* Determine how much space we need. */ |
| for (s = str; s != NULL; s = va_arg (ap, const char *)) |
| total += strlen (s); |
| |
| va_end (ap); |
| |
| result = (char *) malloc (total); |
| if (result != NULL) |
| { |
| result[0] = '\0'; |
| |
| /* Copy the strings. */ |
| for (s = str; s != NULL; s = va_arg (ap2, const char *)) |
| strcat (result, s); |
| } |
| |
| va_end (ap2); |
| |
| return result; |
| } |
| |
| This looks quite simple, especially the second loop where the strings |
| are actually copied. But these innocent lines hide a major performance |
| penalty. Just imagine that ten strings of 100 bytes each have to be |
| concatenated. For the second string we search the already stored 100 |
| bytes for the end of the string so that we can append the next string. |
| For all ten strings in total, the bytes searched to find the end of |
| the intermediate results add up to 0 + 100 + 200 + ... + 900 = 4500! |
| If we combine the copying with the computation of the required size, |
| we can write this function more efficiently: |
| |
| char * |
| concat (const char *str, ...) |
| { |
| va_list ap; |
| size_t allocated = 100; |
| char *result = (char *) malloc (allocated); |
| |
| if (result != NULL) |
| { |
| char *newp; |
| char *wp; |
| const char *s; |
| |
| va_start (ap, str); |
| |
| wp = result; |
| for (s = str; s != NULL; s = va_arg (ap, const char *)) |
| { |
| size_t len = strlen (s); |
| |
| /* Resize the allocated memory if necessary. */ |
| if (wp + len + 1 > result + allocated) |
| { |
| allocated = (allocated + len) * 2; |
| newp = (char *) realloc (result, allocated); |
| if (newp == NULL) |
| { |
| /* Clean up the argument list before the early return. */ |
| va_end (ap); |
| free (result); |
| return NULL; |
| } |
| wp = newp + (wp - result); |
| result = newp; |
| } |
| |
| wp = mempcpy (wp, s, len); |
| } |
| |
| /* Terminate the result string. */ |
| *wp++ = '\0'; |
| |
| /* Resize memory to the optimal size. */ |
| newp = realloc (result, wp - result); |
| if (newp != NULL) |
| result = newp; |
| |
| va_end (ap); |
| } |
| |
| return result; |
| } |
| |
| With a bit more knowledge about the input strings one could fine-tune |
| the memory allocation. The difference we are pointing to here is that |
| we don't use 'strcat' anymore. We always keep track of the length of |
| the current intermediate result, so we can save ourselves the search |
| for the end of the string and use 'mempcpy'. Please note that we also |
| don't use 'stpcpy', which might seem more natural since we are handling |
| strings. But this is not necessary since we already know the length of |
| the string and therefore can use the faster memory copying function. |
| The example would work for wide characters the same way. |
| |
| Whenever a programmer feels the need to use 'strcat' she or he should |
| think twice and check whether the code can be rewritten to take |
| advantage of already calculated results. Again: it is almost always |
| unnecessary to use 'strcat'. |
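| |
| For the simple case of joining two strings, the same idea can be |
| sketched as follows. Each length is computed once and then reused, |
| both for the allocation and for the copying. The name 'join_two' is |
| hypothetical and used only for this illustration. |

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string holding A
   followed by B, or NULL if the allocation fails.  The caller frees
   the result.  */
char *
join_two (const char *a, const char *b)
{
  size_t alen = strlen (a);
  size_t blen = strlen (b);
  char *result = malloc (alen + blen + 1);

  if (result != NULL)
    {
      /* The end of A inside RESULT is already known, so nothing
         has to be searched for.  */
      memcpy (result, a, alen);
      /* Copying BLEN + 1 bytes includes B's terminating null.  */
      memcpy (result + alen, b, blen + 1);
    }
  return result;
}
```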
| |
| -- Function: char * strncat (char *restrict TO, const char *restrict |
| FROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is like 'strcat' except that not more than SIZE |
| characters from FROM are appended to the end of TO. A single null |
| character is also always appended to TO, so the total allocated |
| size of TO must be at least 'SIZE + 1' bytes longer than its |
| initial length. |
| |
| The 'strncat' function could be implemented like this: |
| |
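| char * |
| strncat (char *to, const char *from, size_t size) |
| { |
| /* Remember the length first: once 'memcpy' has overwritten the |
| terminator, 'strlen (to)' is no longer reliable. */ |
| size_t len = strlen (to); |
| memcpy (to + len, from, strnlen (from, size)); |
| to[len + strnlen (from, size)] = '\0'; |
| return to; |
| } |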
| |
| The behavior of 'strncat' is undefined if the strings overlap. |
| |
| -- Function: wchar_t * wcsncat (wchar_t *restrict WTO, const wchar_t |
| *restrict WFROM, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is like 'wcscat' except that not more than SIZE wide |
| characters from WFROM are appended to the end of WTO. A single null |
| wide character is also always appended to WTO, so the total allocated |
| size of WTO must be at least 'SIZE + 1' wide characters longer than |
| its initial length. |
| |
| The 'wcsncat' function could be implemented like this: |
| |
| wchar_t * |
| wcsncat (wchar_t *restrict wto, const wchar_t *restrict wfrom, |
| size_t size) |
| { |
| /* Remember the length before the terminator is overwritten. */ |
| size_t len = wcslen (wto); |
| memcpy (wto + len, wfrom, wcsnlen (wfrom, size) * sizeof (wchar_t)); |
| wto[len + wcsnlen (wfrom, size)] = L'\0'; |
| return wto; |
| } |
| |
| The behavior of 'wcsncat' is undefined if the strings overlap. |
| |
| Here is an example showing the use of 'strncpy' and 'strncat' (the |
| wide character version is equivalent). Notice how, in the call to |
| 'strncat', the SIZE parameter is computed to avoid overflowing the |
| character array 'buffer'. |
| |
| |
| #include <string.h> |
| #include <stdio.h> |
| |
| #define SIZE 10 |
| |
| static char buffer[SIZE]; |
| |
| int |
| main (void) |
| { |
| strncpy (buffer, "hello", SIZE); |
| puts (buffer); |
| strncat (buffer, ", world", SIZE - strlen (buffer) - 1); |
| puts (buffer); |
| return 0; |
| } |
| |
| The output produced by this program looks like: |
| |
| hello |
| hello, wo |
| |
| -- Function: void bcopy (const void *FROM, void *TO, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is a partially obsolete alternative for 'memmove', derived |
| from BSD. Note that it is not quite equivalent to 'memmove', |
| because the arguments are not in the same order and there is no |
| return value. |
| |
| -- Function: void bzero (void *BLOCK, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is a partially obsolete alternative for 'memset', derived from |
| BSD. Note that it is not as general as 'memset', because the only |
| value it can store is zero. |
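| |
| In new code the standard functions can be called directly. The |
| following sketch shows the corresponding calls, with the arguments in |
| the order the standard functions expect. The wrapper names are |
| hypothetical and exist only to make the correspondence visible. |

```c
#include <string.h>

/* Same effect as 'bcopy (from, to, size)'.  Note that 'memmove'
   takes the destination first.  Overlapping blocks are allowed.  */
void
copy_block (const void *from, void *to, size_t size)
{
  memmove (to, from, size);
}

/* Same effect as 'bzero (block, size)'.  */
void
zero_block (void *block, size_t size)
{
  memset (block, 0, size);
}
```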
| |
| |
| File: libc.info, Node: String/Array Comparison, Next: Collation Functions, Prev: Copying and Concatenation, Up: String and Array Utilities |
| |
| 5.5 String/Array Comparison |
| =========================== |
| |
| You can use the functions in this section to perform comparisons on the |
| contents of strings and arrays. As well as checking for equality, these |
| functions can also be used as the ordering functions for sorting |
| operations. *Note Searching and Sorting::, for an example of this. |
| |
| Unlike most comparison operations in C, the string comparison |
| functions return a nonzero value if the strings are _not_ equivalent |
| rather than if they are. The sign of the value indicates the relative |
| ordering of the first characters in the strings that are not equivalent: |
| a negative value indicates that the first string is "less" than the |
| second, while a positive value indicates that the first string is |
| "greater". |
| |
| The most common use of these functions is to check only for equality. |
| This is canonically done with an expression like '! strcmp (s1, s2)'. |
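| |
| Only the sign of the result is meaningful, so code that needs a fixed |
| value has to normalize it. A minimal sketch, using the hypothetical |
| helper name 'compare_sign': |

```c
#include <string.h>

/* Hypothetical helper: reduce the result of 'strcmp' to -1, 0 or 1.  */
int
compare_sign (const char *s1, const char *s2)
{
  int result = strcmp (s1, s2);

  return (result > 0) - (result < 0);
}
```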
| |
| All of these functions are declared in the header file 'string.h'. |
| |
| -- Function: int memcmp (const void *A1, const void *A2, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The function 'memcmp' compares the SIZE bytes of memory beginning |
| at A1 against the SIZE bytes of memory beginning at A2. The value |
| returned has the same sign as the difference between the first |
| differing pair of bytes (interpreted as 'unsigned char' objects, |
| then promoted to 'int'). |
| |
| If the contents of the two blocks are equal, 'memcmp' returns '0'. |
| |
| -- Function: int wmemcmp (const wchar_t *A1, const wchar_t *A2, size_t |
| SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The function 'wmemcmp' compares the SIZE wide characters beginning |
| at A1 against the SIZE wide characters beginning at A2. The value |
| returned is smaller than or larger than zero depending on whether |
| the first differing wide character in A1 is smaller or larger than |
| the corresponding wide character in A2. |
| |
| If the contents of the two blocks are equal, 'wmemcmp' returns '0'. |
| |
| On arbitrary arrays, the 'memcmp' function is mostly useful for |
| testing equality. It usually isn't meaningful to do byte-wise ordering |
| comparisons on arrays of things other than bytes. For example, a |
| byte-wise comparison on the bytes that make up floating-point numbers |
| isn't likely to tell you anything about the relationship between the |
| values of the floating-point numbers. |
| |
| 'wmemcmp' is really only useful to compare arrays of type 'wchar_t' |
| since the function looks at 'sizeof (wchar_t)' bytes at a time and this |
| number of bytes is system dependent. |
| |
| You should also be careful about using 'memcmp' to compare objects |
| that can contain "holes", such as the padding inserted into structure |
| objects to enforce alignment requirements, extra space at the end of |
| unions, and extra characters at the ends of strings whose length is less |
| than their allocated size. The contents of these "holes" are |
| indeterminate and may cause strange behavior when performing byte-wise |
| comparisons. For more predictable results, perform an explicit |
| component-wise comparison. |
| |
| For example, given a structure type definition like: |
| |
| struct foo |
| { |
| unsigned char tag; |
| union |
| { |
| double f; |
| long i; |
| char *p; |
| } value; |
| }; |
| |
| you are better off writing a specialized comparison function to compare |
| 'struct foo' objects instead of comparing them with 'memcmp'. |
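| |
| Such a function might look like the following sketch. The text above |
| does not say what 'tag' means, so the sketch assumes that it selects |
| the active union member, and that two string values are equal when |
| the strings they point to are equal. The 'FOO_*' constants and the |
| name 'foo_equal' are hypothetical. |

```c
#include <string.h>

struct foo
{
  unsigned char tag;
  union
  {
    double f;
    long i;
    char *p;
  } value;
};

/* Assumed meaning of 'tag': which member of 'value' is in use.  */
enum { FOO_DOUBLE, FOO_LONG, FOO_STRING };

/* Return nonzero if A and B hold the same tag and the same value.
   Padding bytes and unused union bytes are never examined.  */
int
foo_equal (const struct foo *a, const struct foo *b)
{
  if (a->tag != b->tag)
    return 0;

  switch (a->tag)
    {
    case FOO_DOUBLE:
      return a->value.f == b->value.f;
    case FOO_LONG:
      return a->value.i == b->value.i;
    case FOO_STRING:
      /* Compare the strings pointed to, not the pointers.  */
      return strcmp (a->value.p, b->value.p) == 0;
    default:
      return 0;
    }
}
```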
| |
| -- Function: int strcmp (const char *S1, const char *S2) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strcmp' function compares the string S1 against S2, returning |
| a value that has the same sign as the difference between the first |
| differing pair of characters (interpreted as 'unsigned char' |
| objects, then promoted to 'int'). |
| |
| If the two strings are equal, 'strcmp' returns '0'. |
| |
| A consequence of the ordering used by 'strcmp' is that if S1 is an |
| initial substring of S2, then S1 is considered to be "less than" |
| S2. |
| |
| 'strcmp' does not take sorting conventions of the language the |
| strings are written in into account. To get that one has to use |
| 'strcoll'. |
| |
| -- Function: int wcscmp (const wchar_t *WS1, const wchar_t *WS2) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcscmp' function compares the wide character string WS1 |
| against WS2. The value returned is smaller than or larger than |
| zero depending on whether the first differing wide character in WS1 |
| is smaller or larger than the corresponding wide character in WS2. |
| |
| If the two strings are equal, 'wcscmp' returns '0'. |
| |
| A consequence of the ordering used by 'wcscmp' is that if WS1 is an |
| initial substring of WS2, then WS1 is considered to be "less than" |
| WS2. |
| |
| 'wcscmp' does not take sorting conventions of the language the |
| strings are written in into account. To get that one has to use |
| 'wcscoll'. |
| |
| -- Function: int strcasecmp (const char *S1, const char *S2) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function is like 'strcmp', except that differences in case are |
| ignored. How uppercase and lowercase characters are related is |
| determined by the currently selected locale. In the standard '"C"' |
| locale the characters Ä and ä do not match but in a locale which |
| regards these characters as parts of the alphabet they do match. |
| |
| 'strcasecmp' is derived from BSD. |
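| |
| A typical use is a case-insensitive equality test, sketched below. |
| The helper name 'equal_ignoring_case' is hypothetical. POSIX |
| specifies the header 'strings.h' for this function, which is the |
| header the sketch includes. |

```c
#include <strings.h>

/* Hypothetical helper: nonzero if S1 and S2 are equal when case is
   ignored, according to the currently selected locale.  */
int
equal_ignoring_case (const char *s1, const char *s2)
{
  return strcasecmp (s1, s2) == 0;
}
```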
| |
| -- Function: int wcscasecmp (const wchar_t *WS1, const wchar_t *WS2) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function is like 'wcscmp', except that differences in case are |
| ignored. How uppercase and lowercase characters are related is |
| determined by the currently selected locale. In the standard '"C"' |
| locale the characters Ä and ä do not match but in a locale which |
| regards these characters as parts of the alphabet they do match. |
| |
| 'wcscasecmp' is a GNU extension. |
| |
| -- Function: int strncmp (const char *S1, const char *S2, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is similar to 'strcmp', except that no more than |
| SIZE characters are compared. In other words, if the two strings |
| are the same in their first SIZE characters, the return value is |
| zero. |
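| |
| One common application is testing whether a string begins with a |
| given prefix. A minimal sketch, with the hypothetical helper name |
| 'has_prefix': |

```c
#include <string.h>

/* Hypothetical helper: nonzero if S begins with PREFIX.  */
int
has_prefix (const char *s, const char *prefix)
{
  /* Compare only as many characters as PREFIX contains.  */
  return strncmp (s, prefix, strlen (prefix)) == 0;
}
```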
| |
| -- Function: int wcsncmp (const wchar_t *WS1, const wchar_t *WS2, |
| size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function is similar to 'wcscmp', except that no more than |
| SIZE wide characters are compared. In other words, if the two |
| strings are the same in their first SIZE wide characters, the |
| return value is zero. |
| |
| -- Function: int strncasecmp (const char *S1, const char *S2, size_t N) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function is like 'strncmp', except that differences in case |
| are ignored. Like 'strcasecmp', it is locale dependent how |
| uppercase and lowercase characters are related. |
| |
| 'strncasecmp' is a GNU extension. |
| |
| -- Function: int wcsncasecmp (const wchar_t *WS1, const wchar_t *WS2, |
| size_t N) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This function is like 'wcsncmp', except that differences in case |
| are ignored. Like 'wcscasecmp', it is locale dependent how |
| uppercase and lowercase characters are related. |
| |
| 'wcsncasecmp' is a GNU extension. |
| |
| Here are some examples showing the use of 'strcmp' and 'strncmp' |
| (equivalent examples can be constructed for the wide character |
| functions). These examples assume the use of the ASCII character set. |
| (If some other character set--say, EBCDIC--is used instead, then the |
| glyphs are associated with different numeric codes, and the return |
| values and ordering may differ.) |
| |
| strcmp ("hello", "hello") |
| => 0 /* These two strings are the same. */ |
| strcmp ("hello", "Hello") |
| => 32 /* Comparisons are case-sensitive. */ |
| strcmp ("hello", "world") |
| => -15 /* The character ''h'' comes before ''w''. */ |
| strcmp ("hello", "hello, world") |
| => -44 /* Comparing a null character against a comma. */ |
| strncmp ("hello", "hello, world", 5) |
| => 0 /* The initial 5 characters are the same. */ |
| strncmp ("hello, world", "hello, stupid world!!!", 5) |
| => 0 /* The initial 5 characters are the same. */ |
| |
| -- Function: int strverscmp (const char *S1, const char *S2) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| The 'strverscmp' function compares the string S1 against S2, |
| considering them as holding indices/version numbers. The return |
| value follows the same conventions as found in the 'strcmp' |
| function. In fact, if S1 and S2 contain no digits, 'strverscmp' |
| behaves like 'strcmp'. |
| |
| Basically, we compare strings normally (character by character), |
| until we find a digit in each string - then we enter a special |
| comparison mode, where each sequence of digits is taken as a whole. |
| If we reach the end of these two parts without noticing a |
| difference, we return to the standard comparison mode. There are |
| two types of numeric parts: "integral" and "fractional" (those |
| begin with a '0'). The types of the numeric parts affect the way |
| we sort them: |
| |
| * integral/integral: we compare values as you would expect. |
| |
| * fractional/integral: the fractional part is less than the |
| integral one. Again, no surprise. |
| |
| * fractional/fractional: things become a bit more complex. |
| If the common prefix contains only leading zeroes, the longer |
| part is less than the other one; else the comparison behaves |
| normally. |
| |
| strverscmp ("no digit", "no digit") |
| => 0 /* same behavior as strcmp. */ |
| strverscmp ("item#99", "item#100") |
| => <0 /* same prefix, but 99 < 100. */ |
| strverscmp ("alpha1", "alpha001") |
| => >0 /* fractional part inferior to integral one. */ |
| strverscmp ("part1_f012", "part1_f01") |
| => >0 /* two fractional parts. */ |
| strverscmp ("foo.009", "foo.0") |
| => <0 /* idem, but with leading zeroes only. */ |
| |
| This function is especially useful when dealing with filename |
| sorting, because filenames frequently hold indices/version numbers. |
| |
| 'strverscmp' is a GNU extension. |
| |
| -- Function: int bcmp (const void *A1, const void *A2, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is an obsolete alias for 'memcmp', derived from BSD. |
| |
| |
| File: libc.info, Node: Collation Functions, Next: Search Functions, Prev: String/Array Comparison, Up: String and Array Utilities |
| |
| 5.6 Collation Functions |
| ======================= |
| |
| In some locales, the conventions for lexicographic ordering differ from |
| the strict numeric ordering of character codes. For example, in Spanish |
| most glyphs with diacritical marks such as accents are not considered |
| distinct letters for the purposes of collation. On the other hand, the |
| two-character sequence 'll' is treated as a single letter that is |
| collated immediately after 'l'. |
| |
| You can use the functions 'strcoll' and 'strxfrm' (declared in the |
| header file 'string.h') and 'wcscoll' and 'wcsxfrm' (declared in the |
| header file 'wchar.h') to compare strings using a collation ordering |
| appropriate for the current locale. The locale used by these functions |
| in particular can be specified by setting the locale for the |
| 'LC_COLLATE' category; see *note Locales::. |
| |
| In the standard C locale, the collation sequence for 'strcoll' is the |
| same as that for 'strcmp'. Similarly, 'wcscoll' and 'wcscmp' are the |
| same in this situation. |
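| |
| This equivalence can be checked with a small sketch such as the |
| following. The helper name 'same_order_in_c_locale' is hypothetical; |
| note that the helper changes the program's collation locale as a |
| side effect. |

```c
#include <locale.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or 1.  */
static int
sign (int value)
{
  return (value > 0) - (value < 0);
}

/* Hypothetical helper: select the "C" locale for collation and
   report whether 'strcoll' and 'strcmp' order S1 and S2 alike.  */
int
same_order_in_c_locale (const char *s1, const char *s2)
{
  setlocale (LC_COLLATE, "C");
  return sign (strcoll (s1, s2)) == sign (strcmp (s1, s2));
}
```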
| |
| Effectively, the way these functions work is by applying a mapping to |
| transform the characters in a string to a byte sequence that represents |
| the string's position in the collating sequence of the current locale. |
| Comparing two such byte sequences in a simple fashion is equivalent to |
| comparing the strings with the locale's collating sequence. |
| |
| The functions 'strcoll' and 'wcscoll' perform this translation |
| implicitly, in order to do one comparison. By contrast, 'strxfrm' and |
| 'wcsxfrm' perform the mapping explicitly. If you are making multiple |
| comparisons using the same string or set of strings, it is likely to be |
| more efficient to use 'strxfrm' or 'wcsxfrm' to transform all the |
| strings just once, and subsequently compare the transformed strings with |
| 'strcmp' or 'wcscmp'. |
| |
| -- Function: int strcoll (const char *S1, const char *S2) |
| Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem | |
| *Note POSIX Safety Concepts::. |
| |
| The 'strcoll' function is similar to 'strcmp' but uses the |
| collating sequence of the current locale for collation (the |
| 'LC_COLLATE' locale). |
| |
| -- Function: int wcscoll (const wchar_t *WS1, const wchar_t *WS2) |
| Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem | |
| *Note POSIX Safety Concepts::. |
| |
| The 'wcscoll' function is similar to 'wcscmp' but uses the |
| collating sequence of the current locale for collation (the |
| 'LC_COLLATE' locale). |
| |
| Here is an example of sorting an array of strings, using 'strcoll' to |
| compare them. The actual sort algorithm is not written here; it comes |
| from 'qsort' (*note Array Sort Function::). The job of the code shown |
| here is to say how to compare the strings while sorting them. (Later on |
| in this section, we will show a way to do this more efficiently using |
| 'strxfrm'.) |
| |
| /* This is the comparison function used with 'qsort'. */ |
| |
| int |
| compare_elements (const void *v1, const void *v2) |
| { |
| char * const *p1 = v1; |
| char * const *p2 = v2; |
| |
| return strcoll (*p1, *p2); |
| } |
| |
| /* This is the entry point--the function to sort |
| strings using the locale's collating sequence. */ |
| |
| void |
| sort_strings (char **array, int nstrings) |
| { |
| /* Sort 'array' by comparing the strings. */ |
| qsort (array, nstrings, |
| sizeof (char *), compare_elements); |
| } |
| |
| -- Function: size_t strxfrm (char *restrict TO, const char *restrict |
| FROM, size_t SIZE) |
| Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem | |
| *Note POSIX Safety Concepts::. |
| |
| The function 'strxfrm' transforms the string FROM using the |
| collation transformation determined by the locale currently |
| selected for collation, and stores the transformed string in the |
| array TO. Up to SIZE characters (including a terminating null |
| character) are stored. |
| |
| The behavior is undefined if the strings TO and FROM overlap; see |
| *note Copying and Concatenation::. |
| |
| The return value is the length of the entire transformed string. |
| This value is not affected by the value of SIZE, but if it is |
| greater than or equal to SIZE, it means that the transformed string |
| did not entirely fit in the array TO. In this case, only as much |
| of the string as actually fits was stored. To get the whole |
| transformed string, call 'strxfrm' again with a bigger output |
| array. |
| |
| The transformed string may be longer than the original string, and |
| it may also be shorter. |
| |
| If SIZE is zero, no characters are stored in TO. In this case, |
| 'strxfrm' simply returns the number of characters that would be the |
| length of the transformed string. This is useful for determining |
| what size the allocated array should be. It does not matter what |
| TO is if SIZE is zero; TO may even be a null pointer. |
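The size query described above can be sketched as follows. The helper name 'xfrm_dup' is hypothetical and not part of the library; it assumes plain 'malloc' is acceptable and returns a null pointer when allocation fails.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated transformed copy of S, or a null pointer
   if allocation fails.  The caller frees the result.  */
char *
xfrm_dup (const char *s)
{
  /* With SIZE zero nothing is stored, so TO may be a null pointer.
     The return value is the length of the transformed string, not
     counting the terminating null character.  */
  size_t needed = strxfrm (NULL, s, 0);
  char *to = malloc (needed + 1);

  if (to != NULL)
    strxfrm (to, s, needed + 1);
  return to;
}
```

Two strings transformed this way can then be compared with 'strcmp', and the result has the same sign as 'strcoll' applied to the originals.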
| |
| -- Function: size_t wcsxfrm (wchar_t *restrict WTO, const wchar_t |
| *WFROM, size_t SIZE) |
| Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem | |
| *Note POSIX Safety Concepts::. |
| |
| The function 'wcsxfrm' transforms wide character string WFROM using |
| the collation transformation determined by the locale currently |
| selected for collation, and stores the transformed string in the |
| array WTO. Up to SIZE wide characters (including a terminating |
| null character) are stored. |
| |
| The behavior is undefined if the strings WTO and WFROM overlap; see |
| *note Copying and Concatenation::. |
| |
| The return value is the length of the entire transformed wide |
| character string. This value is not affected by the value of SIZE, |
| but if it is greater than or equal to SIZE, it means that the |
| transformed wide character string did not entirely fit in the array |
| WTO. In this case, only as much of the wide character string as |
| actually fits was stored. To get the whole transformed wide |
| character string, call 'wcsxfrm' again with a bigger output array. |
| |
| The transformed wide character string may be longer than the |
| original wide character string, and it may also be shorter. |
| |
| If SIZE is zero, no wide characters are stored in WTO. In this case, |
| 'wcsxfrm' simply returns the number of wide characters that would |
| be the length of the transformed wide character string. This is |
| useful for determining what size the allocated array should be |
| (remember to multiply with 'sizeof (wchar_t)'). It does not matter |
| what WTO is if SIZE is zero; WTO may even be a null pointer. |
| |
| Here is an example of how you can use 'strxfrm' when you plan to do |
| many comparisons. It does the same thing as the previous example, but |
| much faster, because it has to transform each string only once, no |
| matter how many times it is compared with other strings. Even the time |
| needed to allocate and free storage is much less than the time we save, |
| when there are many strings. |
| |
| struct sorter { char *input; char *transformed; }; |
| |
| /* This is the comparison function used with 'qsort' |
| to sort an array of 'struct sorter'. */ |
| |
| int |
| compare_elements (const void *v1, const void *v2) |
| { |
| const struct sorter *p1 = v1; |
| const struct sorter *p2 = v2; |
| |
| return strcmp (p1->transformed, p2->transformed); |
| } |
| |
| /* This is the entry point--the function to sort |
| strings using the locale's collating sequence. */ |
| |
| void |
| sort_strings_fast (char **array, int nstrings) |
| { |
| struct sorter temp_array[nstrings]; |
| int i; |
| |
| /* Set up 'temp_array'. Each element contains |
| one input string and its transformed string. */ |
| for (i = 0; i < nstrings; i++) |
| { |
| size_t length = strlen (array[i]) * 2; |
| char *transformed; |
| size_t transformed_length; |
| |
| temp_array[i].input = array[i]; |
| |
| /* First try a buffer perhaps big enough. */ |
| transformed = (char *) xmalloc (length); |
| |
| /* Transform 'array[i]'. */ |
| transformed_length = strxfrm (transformed, array[i], length); |
| |
| /* If the buffer was not large enough, resize it |
| and try again. */ |
| if (transformed_length >= length) |
| { |
| /* Allocate the needed space. +1 for terminating |
| 'NUL' character. */ |
| transformed = (char *) xrealloc (transformed, |
| transformed_length + 1); |
| |
| /* The return value is not interesting because we know |
| how long the transformed string is. */ |
| (void) strxfrm (transformed, array[i], |
| transformed_length + 1); |
| } |
| |
| temp_array[i].transformed = transformed; |
| } |
| |
| /* Sort 'temp_array' by comparing transformed strings. */ |
| qsort (temp_array, nstrings, |
| sizeof (struct sorter), compare_elements); |
| |
| /* Put the elements back in the permanent array |
| in their sorted order. */ |
| for (i = 0; i < nstrings; i++) |
| array[i] = temp_array[i].input; |
| |
| /* Free the strings we allocated. */ |
| for (i = 0; i < nstrings; i++) |
| free (temp_array[i].transformed); |
| } |
| |
| The interesting part of this code for the wide character version |
| would look like this: |
| |
| void |
| sort_strings_fast (wchar_t **array, int nstrings) |
| { |
| ... |
| /* Transform 'array[i]'. */ |
| transformed_length = wcsxfrm (transformed, array[i], length); |
| |
| /* If the buffer was not large enough, resize it |
| and try again. */ |
| if (transformed_length >= length) |
| { |
| /* Allocate the needed space. +1 for terminating |
| 'NUL' character. */ |
| transformed = (wchar_t *) xrealloc (transformed, |
| (transformed_length + 1) |
| * sizeof (wchar_t)); |
| |
| /* The return value is not interesting because we know |
| how long the transformed string is. */ |
| (void) wcsxfrm (transformed, array[i], |
| transformed_length + 1); |
| } |
| ... |
| |
| Note the additional multiplication by 'sizeof (wchar_t)' in the |
| 'xrealloc' call. |
| |
| *Compatibility Note:* The string collation functions are a new |
| feature of ISO C90. Older C dialects have no equivalent feature. The |
| wide character versions were introduced in Amendment 1 to ISO C90. |
| |
| |
| File: libc.info, Node: Search Functions, Next: Finding Tokens in a String, Prev: Collation Functions, Up: String and Array Utilities |
| |
| 5.7 Search Functions |
| ==================== |
| |
| This section describes library functions which perform various kinds of |
| searching operations on strings and arrays. These functions are |
| declared in the header file 'string.h'. |
| |
| -- Function: void * memchr (const void *BLOCK, int C, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function finds the first occurrence of the byte C (converted |
| to an 'unsigned char') in the initial SIZE bytes of the object |
| beginning at BLOCK. The return value is a pointer to the located |
| byte, or a null pointer if no match was found. |
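As a small illustration, 'memchr' can search a buffer that is not null-terminated, since the search is bounded by SIZE. The function name 'first_line_length' below is made up for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Return the length of the first line in BUF (not counting the
   newline), or SIZE if the first SIZE bytes contain no newline.  */
size_t
first_line_length (const char *buf, size_t size)
{
  const char *nl = memchr (buf, '\n', size);

  return nl != NULL ? (size_t) (nl - buf) : size;
}
```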
| |
| -- Function: wchar_t * wmemchr (const wchar_t *BLOCK, wchar_t WC, |
| size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function finds the first occurrence of the wide character WC |
| in the initial SIZE wide characters of the object beginning at |
| BLOCK. The return value is a pointer to the located wide |
| character, or a null pointer if no match was found. |
| |
| -- Function: void * rawmemchr (const void *BLOCK, int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Often the 'memchr' function is used with the knowledge that the |
| byte C is available in the memory block specified by the |
| parameters. But this means that the SIZE parameter is not really |
| needed and that the tests performed with it at runtime (to check |
| whether the end of the block is reached) are not needed. |
| |
| The 'rawmemchr' function exists for just this situation which is |
| surprisingly frequent. The interface is similar to 'memchr' except |
| that the SIZE parameter is missing. The function will look beyond |
| the end of the block pointed to by BLOCK in case the programmer |
| made an error in assuming that the byte C is present in the block. |
| In this case the result is unspecified. Otherwise the return value |
| is a pointer to the located byte. |
| |
| This function is of special interest when looking for the end of a |
| string. Since all strings are terminated by a null byte a call |
| like |
| |
| rawmemchr (str, '\0') |
| |
| will never go beyond the end of the string. |
| |
| This function is a GNU extension. |
| |
| -- Function: void * memrchr (const void *BLOCK, int C, size_t SIZE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The function 'memrchr' is like 'memchr', except that it searches |
| backwards from the end of the block defined by BLOCK and SIZE |
| (instead of forwards from the front). |
| |
| This function is a GNU extension. |
| |
| -- Function: char * strchr (const char *STRING, int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strchr' function finds the first occurrence of the character C |
| (converted to a 'char') in the null-terminated string beginning at |
| STRING. The return value is a pointer to the located character, or |
| a null pointer if no match was found. |
| |
| For example, |
| strchr ("hello, world", 'l') |
| => "llo, world" |
| strchr ("hello, world", '?') |
| => NULL |
| |
| The terminating null character is considered to be part of the |
| string, so you can use this function to get a pointer to the end of a |
| string by specifying a null character as the value of the C |
| argument. |
| |
| When 'strchr' returns a null pointer, it does not let you know the |
| position of the terminating null character it has found. If you |
| need that information, it is better (but less portable) to use |
| 'strchrnul' than to search for it a second time. |
| |
| -- Function: wchar_t * wcschr (const wchar_t *WSTRING, wchar_t WC) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcschr' function finds the first occurrence of the wide |
| character WC in the null-terminated wide character string beginning |
| at WSTRING. The return value is a pointer to the located wide |
| character, or a null pointer if no match was found. |
| |
| The terminating null character is considered to be part of the wide |
| character string, so you can use this function to get a pointer to |
| the end of a wide character string by specifying a null wide |
| character as the value of the WC argument. It would be better (but |
| less portable) to use 'wcschrnul' in this case, though. |
| |
| -- Function: char * strchrnul (const char *STRING, int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'strchrnul' is the same as 'strchr' except that if it does not find |
| the character, it returns a pointer to the string's terminating null |
| character rather than a null pointer. |
| |
| This function is a GNU extension. |
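Because 'strchrnul' never returns a null pointer, pointer arithmetic on its result is always valid. The sketch below assumes entries of a hypothetical form "key=value"; the name 'key_length' is illustrative only.

```c
#define _GNU_SOURCE   /* strchrnul is a GNU extension.  */
#include <stddef.h>
#include <string.h>

/* Return the length of the key in ENTRY.  If ENTRY contains no '=',
   the whole string counts as the key.  */
size_t
key_length (const char *entry)
{
  /* The result points either at the '=' or at the terminating
     null character, so the subtraction is always well defined.  */
  return (size_t) (strchrnul (entry, '=') - entry);
}
```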
| |
| -- Function: wchar_t * wcschrnul (const wchar_t *WSTRING, wchar_t WC) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'wcschrnul' is the same as 'wcschr' except that if it does not find |
| the wide character, it returns a pointer to the wide character |
| string's terminating null wide character rather than a null pointer. |
| |
| This function is a GNU extension. |
| |
| One useful, but unusual, use of the 'strchr' function is when one |
| wants to have a pointer pointing to the NUL byte terminating a string. |
| This is often written in this way: |
| |
| s += strlen (s); |
| |
| This is almost optimal but the addition operation duplicates a bit of |
| the work already done in the 'strlen' function. A better solution is |
| this: |
| |
| s = strchr (s, '\0'); |
| |
| There is no restriction on the second parameter of 'strchr' so it |
| could very well also be the NUL character. Those readers thinking very |
| hard about this might now point out that the 'strchr' function is more |
| expensive than the 'strlen' function since we have two abort criteria. |
| This is right. But in the GNU C Library the implementation of 'strchr' |
| is optimized in a special way so that 'strchr' actually is faster. |
| |
| -- Function: char * strrchr (const char *STRING, int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The function 'strrchr' is like 'strchr', except that it searches |
| backwards from the end of the string STRING (instead of forwards |
| from the front). |
| |
| For example, |
| strrchr ("hello, world", 'l') |
| => "ld" |
| |
| -- Function: wchar_t * wcsrchr (const wchar_t *WSTRING, wchar_t C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The function 'wcsrchr' is like 'wcschr', except that it searches |
| backwards from the end of the string WSTRING (instead of forwards |
| from the front). |
| |
| -- Function: char * strstr (const char *HAYSTACK, const char *NEEDLE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is like 'strchr', except that it searches HAYSTACK for a |
| substring NEEDLE rather than just a single character. It returns a |
| pointer into the string HAYSTACK that is the first character of the |
| substring, or a null pointer if no match was found. If NEEDLE is |
| an empty string, the function returns HAYSTACK. |
| |
| For example, |
| strstr ("hello, world", "l") |
| => "llo, world" |
| strstr ("hello, world", "wo") |
| => "world" |
| |
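Repeated calls to 'strstr' can walk through every match in a string. The following is a minimal sketch. The name 'count_occurrences' is hypothetical, and treating an empty NEEDLE as "no occurrences" is a choice made here, not something the library prescribes.

```c
#include <stddef.h>
#include <string.h>

/* Count the non-overlapping occurrences of NEEDLE in HAYSTACK.  */
size_t
count_occurrences (const char *haystack, const char *needle)
{
  size_t len = strlen (needle);
  size_t count = 0;

  /* 'strstr' returns HAYSTACK itself for an empty NEEDLE, so the
     loop below would never advance.  Report zero matches instead.  */
  if (len == 0)
    return 0;

  while ((haystack = strstr (haystack, needle)) != NULL)
    {
      count++;
      haystack += len;   /* Resume just past this match.  */
    }
  return count;
}
```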
| -- Function: wchar_t * wcsstr (const wchar_t *HAYSTACK, const wchar_t |
| *NEEDLE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is like 'wcschr', except that it searches HAYSTACK for a |
| substring NEEDLE rather than just a single wide character. It |
| returns a pointer into the string HAYSTACK that is the first wide |
| character of the substring, or a null pointer if no match was |
| found. If NEEDLE is an empty string, the function returns |
| HAYSTACK. |
| |
| -- Function: wchar_t * wcswcs (const wchar_t *HAYSTACK, const wchar_t |
| *NEEDLE) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'wcswcs' is a deprecated alias for 'wcsstr'. This is the name |
| originally used in the X/Open Portability Guide before the Amendment 1 |
| to ISO C90 was published. |
| |
| -- Function: char * strcasestr (const char *HAYSTACK, const char |
| *NEEDLE) |
| Preliminary: | MT-Safe locale | AS-Safe | AC-Safe | *Note POSIX |
| Safety Concepts::. |
| |
| This is like 'strstr', except that it ignores case in searching for |
| the substring. Like 'strcasecmp', it is locale dependent how |
| uppercase and lowercase characters are related. |
| |
| For example, |
| strcasestr ("hello, world", "L") |
| => "llo, world" |
| strcasestr ("hello, World", "wo") |
| => "World" |
| |
| -- Function: void * memmem (const void *HAYSTACK, size_t HAYSTACK-LEN, |
| const void *NEEDLE, size_t NEEDLE-LEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is like 'strstr', but NEEDLE and HAYSTACK are byte arrays |
| rather than null-terminated strings. NEEDLE-LEN is the length of |
| NEEDLE and HAYSTACK-LEN is the length of HAYSTACK. |
| |
| This function is a GNU extension. |
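Unlike 'strstr', 'memmem' can search data that contains null bytes. A small sketch, with the hypothetical wrapper 'find_bytes' returning an offset rather than a pointer:

```c
#define _GNU_SOURCE   /* memmem is a GNU extension.  */
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of the PAT_LEN bytes at
   PAT within the BUF_LEN bytes at BUF, or -1 if there is none.  */
long
find_bytes (const void *buf, size_t buf_len,
            const void *pat, size_t pat_len)
{
  const char *hit = memmem (buf, buf_len, pat, pat_len);

  return hit != NULL ? (long) (hit - (const char *) buf) : -1L;
}
```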
| |
| -- Function: size_t strspn (const char *STRING, const char *SKIPSET) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strspn' ("string span") function returns the length of the |
| initial substring of STRING that consists entirely of characters |
| that are members of the set specified by the string SKIPSET. The |
| order of the characters in SKIPSET is not important. |
| |
| For example, |
| strspn ("hello, world", "abcdefghijklmnopqrstuvwxyz") |
| => 5 |
| |
| Note that "character" is here used in the sense of byte. In a |
| string using a multibyte character encoding, an (abstract) character |
| consisting of more than one byte is not treated as a single entity. |
| Each byte is treated separately. The function is not |
| locale-dependent. |
| |
| -- Function: size_t wcsspn (const wchar_t *WSTRING, const wchar_t |
| *SKIPSET) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcsspn' ("wide character string span") function returns the |
| length of the initial substring of WSTRING that consists entirely |
| of wide characters that are members of the set specified by the |
| string SKIPSET. The order of the wide characters in SKIPSET is not |
| important. |
| |
| -- Function: size_t strcspn (const char *STRING, const char *STOPSET) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strcspn' ("string complement span") function returns the |
| length of the initial substring of STRING that consists entirely of |
| characters that are _not_ members of the set specified by the |
| string STOPSET. (In other words, it returns the offset of the |
| first character in STRING that is a member of the set STOPSET.) |
| |
| For example, |
| strcspn ("hello, world", " \t\n,.;!?") |
| => 5 |
| |
| Note that "character" is here used in the sense of byte. In a |
| string using a multibyte character encoding, an (abstract) character |
| consisting of more than one byte is not treated as a single entity. |
| Each byte is treated separately. The function is not |
| locale-dependent. |
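'strspn' and 'strcspn' are often used together: one skips a run of unwanted bytes, the other measures the run that follows. The sketch below picks out the first blank-delimited word of a line without modifying it. The name 'first_word' and the choice of blank characters are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Locate the first word in LINE.  Store its length in *LEN and
   return a pointer to its first byte.  If LINE holds no word, *LEN
   is 0 and the result points at the terminating null character.  */
const char *
first_word (const char *line, size_t *len)
{
  static const char blanks[] = " \t\n";

  line += strspn (line, blanks);    /* Skip leading blanks.  */
  *len = strcspn (line, blanks);    /* Measure up to the next blank.  */
  return line;
}
```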
| |
| -- Function: size_t wcscspn (const wchar_t *WSTRING, const wchar_t |
| *STOPSET) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcscspn' ("wide character string complement span") function |
| returns the length of the initial substring of WSTRING that |
| consists entirely of wide characters that are _not_ members of the |
| set specified by the string STOPSET. (In other words, it returns |
| the offset of the first wide character in WSTRING that is a member |
| of the set STOPSET.) |
| |
| -- Function: char * strpbrk (const char *STRING, const char *STOPSET) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'strpbrk' ("string pointer break") function is related to |
| 'strcspn', except that it returns a pointer to the first character |
| in STRING that is a member of the set STOPSET instead of the length |
| of the initial substring. It returns a null pointer if no such |
| character from STOPSET is found. |
| |
| For example, |
| |
| strpbrk ("hello, world", " \t\n,.;!?") |
| => ", world" |
| |
| Note that "character" is here used in the sense of byte. In a |
| string using a multibyte character encoding, an (abstract) character |
| consisting of more than one byte is not treated as a single entity. |
| Each byte is treated separately. The function is not |
| locale-dependent. |
| |
| -- Function: wchar_t * wcspbrk (const wchar_t *WSTRING, const wchar_t |
| *STOPSET) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcspbrk' ("wide character string pointer break") function is |
| related to 'wcscspn', except that it returns a pointer to the first |
| wide character in WSTRING that is a member of the set STOPSET |
| instead of the length of the initial substring. It returns a null |
| pointer if no such character from STOPSET is found. |
| |
| 5.7.1 Compatibility String Search Functions |
| ------------------------------------------- |
| |
| -- Function: char * index (const char *STRING, int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'index' is another name for 'strchr'; they are exactly the same. |
| New code should always use 'strchr' since this name is defined in ISO C |
| while 'index' is a BSD invention which never was available on System V |
| derived systems. |
| |
| -- Function: char * rindex (const char *STRING, int C) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'rindex' is another name for 'strrchr'; they are exactly the same. |
| New code should always use 'strrchr' since this name is defined in ISO C |
| while 'rindex' is a BSD invention which never was available on System V |
| derived systems. |
| |
| |
| File: libc.info, Node: Finding Tokens in a String, Next: strfry, Prev: Search Functions, Up: String and Array Utilities |
| |
| 5.8 Finding Tokens in a String |
| ============================== |
| |
| It's fairly common for programs to have a need to do some simple kinds |
| of lexical analysis and parsing, such as splitting a command string up |
| into tokens. You can do this with the 'strtok' function, declared in |
| the header file 'string.h'. |
| |
| -- Function: char * strtok (char *restrict NEWSTRING, const char |
| *restrict DELIMITERS) |
| Preliminary: | MT-Unsafe race:strtok | AS-Unsafe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| A string can be split into tokens by making a series of calls to |
| the function 'strtok'. |
| |
| The string to be split up is passed as the NEWSTRING argument on |
| the first call only. The 'strtok' function uses this to set up |
| some internal state information. Subsequent calls to get |
| additional tokens from the same string are indicated by passing a |
| null pointer as the NEWSTRING argument. Calling 'strtok' with |
| another non-null NEWSTRING argument reinitializes the state |
| information. It is guaranteed that no other library function ever |
| calls 'strtok' behind your back (which would mess up this internal |
| state information). |
| |
| The DELIMITERS argument is a string that specifies a set of |
| delimiters that may surround the token being extracted. All the |
| initial characters that are members of this set are discarded. The |
| first character that is _not_ a member of this set of delimiters |
| marks the beginning of the next token. The end of the token is |
| found by looking for the next character that is a member of the |
| delimiter set. This character in the original string NEWSTRING is |
| overwritten by a null character, and the pointer to the beginning |
| of the token in NEWSTRING is returned. |
| |
| On the next call to 'strtok', the searching begins at the next |
| character beyond the one that marked the end of the previous token. |
| Note that the set of delimiters DELIMITERS does not have to be the |
| same on every call in a series of calls to 'strtok'. |
| |
| If the end of the string NEWSTRING is reached, or if the remainder |
| of the string consists only of delimiter characters, 'strtok' returns a |
| null pointer. |
| |
| Note that "character" is here used in the sense of byte. In a |
| string using a multibyte character encoding, an (abstract) character |
| consisting of more than one byte is not treated as a single entity. |
| Each byte is treated separately. The function is not |
| locale-dependent. |
| |
| -- Function: wchar_t * wcstok (wchar_t *NEWSTRING, const wchar_t |
| *DELIMITERS, wchar_t **SAVE_PTR) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| A string can be split into tokens by making a series of calls to |
| the function 'wcstok'. |
| |
| The string to be split up is passed as the NEWSTRING argument on |
| the first call only. The 'wcstok' function uses this to set up |
| some internal state information. Subsequent calls to get |
| additional tokens from the same wide character string are indicated |
| by passing a null pointer as the NEWSTRING argument, which causes |
| the pointer previously stored in SAVE_PTR to be used instead. |
| |
| The DELIMITERS argument is a wide character string that specifies a |
| set of delimiters that may surround the token being extracted. All |
| the initial wide characters that are members of this set are |
| discarded. The first wide character that is _not_ a member of this |
| set of delimiters marks the beginning of the next token. The end |
| of the token is found by looking for the next wide character that |
| is a member of the delimiter set. This wide character in the |
| original wide character string NEWSTRING is overwritten by a null |
| wide character, the pointer past the overwritten wide character is |
| saved in SAVE_PTR, and the pointer to the beginning of the token in |
| NEWSTRING is returned. |
| |
| On the next call to 'wcstok', the searching begins at the next wide |
| character beyond the one that marked the end of the previous token. |
| Note that the set of delimiters DELIMITERS does not have to be the |
| same on every call in a series of calls to 'wcstok'. |
| |
| If the end of the wide character string NEWSTRING is reached, or if |
| the remainder of the string consists only of delimiter wide characters, |
| 'wcstok' returns a null pointer. |
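The usual calling pattern for 'wcstok' is a loop that passes the string on the first call and a null pointer afterwards. The following sketch counts tokens. The function name 'count_wide_tokens' is hypothetical, and the three-argument form of 'wcstok' described above is assumed.

```c
#include <stddef.h>
#include <wchar.h>

/* Count the tokens in the writable wide character string WS.
   WS is modified in the process.  */
size_t
count_wide_tokens (wchar_t *ws, const wchar_t *delimiters)
{
  wchar_t *save;
  wchar_t *token;
  size_t count = 0;

  for (token = wcstok (ws, delimiters, &save);
       token != NULL;
       token = wcstok (NULL, delimiters, &save))
    count++;
  return count;
}
```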
| |
| *Warning:* Since 'strtok' and 'wcstok' alter the string they are |
| parsing, you should always copy the string to a temporary buffer before |
| parsing it with 'strtok'/'wcstok' (*note Copying and Concatenation::). |
| If you allow 'strtok' or 'wcstok' to modify a string that came from |
| another part of your program, you are asking for trouble; that string |
| might be used for other purposes after 'strtok' or 'wcstok' has modified |
| it, and it would not have the expected value. |
| |
| The string that you are operating on might even be a constant. Then |
| when 'strtok' or 'wcstok' tries to modify it, your program will get a |
| fatal signal for writing in read-only memory. *Note Program Error |
| Signals::. Even if the operation of 'strtok' or 'wcstok' would not |
| require a modification of the string (e.g., if there is exactly one |
| token) the string can (and in the GNU C Library case will) be modified. |
| |
| This is a special case of a general principle: if a part of a program |
| does not have as its purpose the modification of a certain data |
| structure, then it is error-prone to modify the data structure |
| temporarily. |
| |
| The function 'strtok' is not reentrant, whereas 'wcstok' is. *Note |
| Nonreentrancy::, for a discussion of where and why reentrancy is |
| important. |
| |
| Here is a simple example showing the use of 'strtok'. |
| |
| #include <string.h> |
| #include <stddef.h> |
| |
| ... |
| |
| const char string[] = "words separated by spaces -- and, punctuation!"; |
| const char delimiters[] = " .,;:!-"; |
| char *token, *cp; |
| |
| ... |
| |
| cp = strdupa (string); /* Make writable copy. */ |
| token = strtok (cp, delimiters); /* token => "words" */ |
| token = strtok (NULL, delimiters); /* token => "separated" */ |
| token = strtok (NULL, delimiters); /* token => "by" */ |
| token = strtok (NULL, delimiters); /* token => "spaces" */ |
| token = strtok (NULL, delimiters); /* token => "and" */ |
| token = strtok (NULL, delimiters); /* token => "punctuation" */ |
| token = strtok (NULL, delimiters); /* token => NULL */ |
| |
| The GNU C Library contains two more functions for tokenizing a string |
| which overcome the limitation of non-reentrancy. They are only |
| available for multibyte character strings. |
| |
| -- Function: char * strtok_r (char *NEWSTRING, const char *DELIMITERS, |
| char **SAVE_PTR) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Just like 'strtok', this function splits the string into several |
| tokens which can be accessed by successive calls to 'strtok_r'. |
| The difference is that, as in 'wcstok', the information about the |
| next token is stored in the space pointed to by the third argument, |
| SAVE_PTR, which is a pointer to a string pointer. Calling |
| 'strtok_r' with a null pointer for NEWSTRING and leaving SAVE_PTR |
| between the calls unchanged does the job without hindering |
| reentrancy. |
| |
| This function is defined in POSIX.1 and can be found on many |
| systems which support multi-threading. |
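Because each scan keeps its state in a caller-supplied pointer, two 'strtok_r' scans can be nested, which is not possible with 'strtok'. The sketch below splits text into lines and each line into words. The name 'count_words' and the delimiter sets are assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* Request the POSIX declarations.  */
#include <stddef.h>
#include <string.h>

/* Count the words in TEXT, which holds lines separated by '\n' and,
   within each line, words separated by blanks.  TEXT is modified.  */
size_t
count_words (char *text)
{
  char *line_state;
  char *word_state;
  char *line;
  char *word;
  size_t count = 0;

  for (line = strtok_r (text, "\n", &line_state);
       line != NULL;
       line = strtok_r (NULL, "\n", &line_state))
    /* The inner scan has its own state, so it does not disturb
       the outer one.  */
    for (word = strtok_r (line, " \t", &word_state);
         word != NULL;
         word = strtok_r (NULL, " \t", &word_state))
      count++;
  return count;
}
```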
| |
| -- Function: char * strsep (char **STRING_PTR, const char *DELIMITER) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This function has functionality similar to 'strtok_r', with the |
| NEWSTRING argument replaced by the SAVE_PTR argument. The |
| initialization of the moving pointer has to be done by the user. |
| Successive calls to 'strsep' move the pointer along the tokens |
| separated by DELIMITER, returning the address of the next token and |
| updating STRING_PTR to point to the beginning of the next token. |
| |
| One difference between 'strsep' and 'strtok_r' is that if the input |
| string contains more than one character from DELIMITER in a row |
| 'strsep' returns an empty string for each pair of characters from |
| DELIMITER. This means that a program normally should test for |
| 'strsep' returning an empty string before processing it. |
| |
| This function was introduced in 4.3BSD and therefore is widely |
| available. |
| |
| Here is how the above example looks when 'strsep' is used. |
| |
| #include <string.h> |
| #include <stddef.h> |
| |
| ... |
| |
| const char string[] = "words separated by spaces -- and, punctuation!"; |
| const char delimiters[] = " .,;:!-"; |
| char *running; |
| char *token; |
| |
| ... |
| |
| running = strdupa (string); |
| token = strsep (&running, delimiters); /* token => "words" */ |
| token = strsep (&running, delimiters); /* token => "separated" */ |
| token = strsep (&running, delimiters); /* token => "by" */ |
| token = strsep (&running, delimiters); /* token => "spaces" */ |
| token = strsep (&running, delimiters); /* token => "" */ |
| token = strsep (&running, delimiters); /* token => "" */ |
| token = strsep (&running, delimiters); /* token => "" */ |
| token = strsep (&running, delimiters); /* token => "and" */ |
| token = strsep (&running, delimiters); /* token => "" */ |
| token = strsep (&running, delimiters); /* token => "punctuation" */ |
| token = strsep (&running, delimiters); /* token => "" */ |
| token = strsep (&running, delimiters); /* token => NULL */ |
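In practice the calls are usually made in a loop that discards the empty strings produced by adjacent delimiters. A minimal sketch, with the hypothetical name 'count_nonempty_tokens':

```c
#define _DEFAULT_SOURCE   /* Assumed sufficient to declare strsep.  */
#include <stddef.h>
#include <string.h>

/* Count the non-empty tokens in the writable string S.  S is
   modified in the process.  */
size_t
count_nonempty_tokens (char *s, const char *delimiters)
{
  char *running = s;
  char *token;
  size_t count = 0;

  while ((token = strsep (&running, delimiters)) != NULL)
    if (*token != '\0')   /* Skip empty strings between delimiters.  */
      count++;
  return count;
}
```

Applied to the string from the example above, this counts six tokens: "words", "separated", "by", "spaces", "and" and "punctuation".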
| |
| -- Function: char * basename (const char *FILENAME) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The GNU version of the 'basename' function returns the last |
| component of the path in FILENAME. This function is the preferred |
| usage, since it does not modify the argument, FILENAME, and |
| respects trailing slashes. The prototype for 'basename' can be |
| found in 'string.h'. Note, this function is overridden by the XPG |
| version, if 'libgen.h' is included. |
| |
| Example of using GNU 'basename': |
| |
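| #define _GNU_SOURCE /* Exposes the GNU 'basename' declaration. */ |
| #include <stdio.h> |
| #include <stdlib.h> |
| #include <string.h> |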
| |
| int |
| main (int argc, char *argv[]) |
| { |
| char *prog = basename (argv[0]); |
| |
| if (argc < 2) |
| { |
| fprintf (stderr, "Usage %s <arg>\n", prog); |
| exit (1); |
| } |
| |
| ... |
| } |
| |
| *Portability Note:* This function may produce different results on |
| different systems. |
| |
| -- Function: char * basename (const char *PATH) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| This is the standard XPG defined 'basename'. It is similar in |
| spirit to the GNU version, but may modify the PATH by removing |
| trailing '/' characters. If the PATH is made up entirely of '/' |
| characters, then "/" will be returned. Also, if PATH is 'NULL' or |
| an empty string, then "." is returned. The prototype for the XPG |
| version can be found in 'libgen.h'. |
| |
| Example of using XPG 'basename': |
| |
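| #define _GNU_SOURCE /* Exposes 'strdupa'. */ |
| #include <libgen.h> |
| #include <stdio.h> |
| #include <stdlib.h> |
| #include <string.h> |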
| |
| int |
| main (int argc, char *argv[]) |
| { |
| char *prog; |
| char *path = strdupa (argv[0]); |
| |
| prog = basename (path); |
| |
| if (argc < 2) |
| { |
| fprintf (stderr, "Usage %s <arg>\n", prog); |
| exit (1); |
| } |
| |
| ... |
| |
| } |
| |
| -- Function: char * dirname (char *PATH) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'dirname' function is the complement to the XPG version of |
| 'basename'. It returns the parent directory of the file specified |
| by PATH. If PATH is 'NULL', an empty string, or contains no '/' |
| characters, then "." is returned. The prototype for this function |
| can be found in 'libgen.h'. |
| |
| |
| File: libc.info, Node: strfry, Next: Trivial Encryption, Prev: Finding Tokens in a String, Up: String and Array Utilities |
| |
| 5.9 strfry |
| ========== |
| |
| The function below addresses the perennial programming quandary: "How do |
| I take good data in string form and painlessly turn it into garbage?" |
| This is actually a fairly simple task for C programmers who do not use |
| the GNU C Library string functions, but for programs based on the GNU C |
| Library, the 'strfry' function is the preferred method for destroying |
| string data. |
| |
| The prototype for this function is in 'string.h'. |
| |
| -- Function: char * strfry (char *STRING) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'strfry' creates a pseudorandom anagram of a string, replacing the |
| input with the anagram in place. For each position in the string, |
| 'strfry' swaps it with a position in the string selected at random |
| (from a uniform distribution). The two positions may be the same. |
| |
| The return value of 'strfry' is always STRING. |
| |
| *Portability Note:* This function is unique to the GNU C Library. |
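| |
| As 'strfry' works in place, a program that still needs the original |
| text has to shuffle a copy. A minimal sketch, assuming a hypothetical |
| helper named 'shuffled_copy': |
| |

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated pseudorandom anagram of
   S, or NULL if memory could not be allocated.  The caller frees the
   result.  S itself is left untouched.  */
char *
shuffled_copy (const char *s)
{
  char *copy;

  assert (s != NULL);

  copy = strdup (s);
  if (copy == NULL)
    return NULL;

  /* strfry shuffles in place and returns its argument.  */
  return strfry (copy);
}
```

| The result has the same length and the same characters as the input; |
| only their order may differ. |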
| |
| |
| File: libc.info, Node: Trivial Encryption, Next: Encode Binary Data, Prev: strfry, Up: String and Array Utilities |
| |
| 5.10 Trivial Encryption |
| ======================= |
| |
| The 'memfrob' function converts an array of data to something |
| unrecognizable and back again. It is not encryption in its usual sense |
| since it is easy for someone to convert the encrypted data back to clear |
| text. The transformation is analogous to Usenet's "Rot13" encryption |
| method for obscuring offensive jokes from sensitive eyes and such. |
| Unlike Rot13, 'memfrob' works on arbitrary binary data, not just text. |
| |
| For true encryption, *Note Cryptographic Functions::. |
| |
| This function is declared in 'string.h'. |
| |
| -- Function: void * memfrob (void *MEM, size_t LENGTH) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| 'memfrob' transforms (frobnicates) each byte of the data structure |
| at MEM, which is LENGTH bytes long, by bitwise exclusive oring it |
| with binary 00101010. It does the transformation in place and its |
| return value is always MEM. |
| |
| Note that 'memfrob' a second time on the same data structure |
| returns it to its original state. |
| |
| This is a good function for hiding information from someone who |
| doesn't want to see it or doesn't want to see it very much. To |
| really prevent people from retrieving the information, use stronger |
| encryption such as that described in *Note Cryptographic |
| Functions::. |
| |
| *Portability Note:* This function is unique to the GNU C Library. |
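| |
| The two properties described above (each byte is exclusive-ored with |
| 42, and a second call restores the data) can be checked with a short |
| sketch. The helper name 'frob_roundtrip_ok' is hypothetical. |
| |

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frobnicate a copy of TEXT, check that every byte
   equals the original byte XOR 42 (binary 00101010), then frobnicate
   again and check that the original bytes are back.  Returns 1 if both
   checks pass, 0 if one fails, and -1 if allocation fails.  */
int
frob_roundtrip_ok (const char *text)
{
  size_t len;
  size_t i;
  char *buf;
  int ok = 1;

  assert (text != NULL);

  len = strlen (text);
  buf = malloc (len + 1);
  if (buf == NULL)
    return -1;
  memcpy (buf, text, len + 1);

  memfrob (buf, len);
  for (i = 0; i < len; i++)
    if ((unsigned char) buf[i] != ((unsigned char) text[i] ^ 0x2A))
      ok = 0;

  memfrob (buf, len);
  if (memcmp (buf, text, len) != 0)
    ok = 0;

  free (buf);
  return ok;
}
```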
| |
| |
| File: libc.info, Node: Encode Binary Data, Next: Argz and Envz Vectors, Prev: Trivial Encryption, Up: String and Array Utilities |
| |
| 5.11 Encode Binary Data |
| ======================= |
| |
| To store or transfer binary data in environments which only support text |
| one has to encode the binary data by mapping the input bytes to |
| characters in the range allowed for storing or transferring. SVID |
| systems (and nowadays XPG compliant systems) provide minimal support for |
| this task. |
| |
| -- Function: char * l64a (long int N) |
| Preliminary: | MT-Unsafe race:l64a | AS-Unsafe | AC-Safe | *Note |
| POSIX Safety Concepts::. |
| |
| This function encodes a 32-bit input value using characters from |
| the basic character set. It returns a pointer to a 7 character |
| buffer which contains an encoded version of N. To encode a series |
| of bytes the user must copy the returned string to a destination |
| buffer. It returns the empty string if N is zero, which is |
| somewhat bizarre but mandated by the standard. |
| |
| *Warning:* Since a static buffer is used this function should not |
| be used in multi-threaded programs. There is no thread-safe |
| alternative to this function in the C library. |
| |
| *Compatibility Note:* The XPG standard states that the return value |
| of 'l64a' is undefined if N is negative. In the GNU |
| implementation, 'l64a' treats its argument as unsigned, so it will |
| return a sensible encoding for any nonzero N; however, portable |
| programs should not rely on this. |
| |
| To encode a large buffer 'l64a' must be called in a loop, once for |
| each 32-bit word of the buffer. For example, one could do |
| something like this: |
| |
| char * |
| encode (const void *buf, size_t len) |
| { |
| /* We know in advance how long the buffer has to be. */ |
| unsigned char *in = (unsigned char *) buf; |
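| char *out = malloc (6 + ((len + 3) / 4) * 6 + 1); |
| char *cp = out, *p; |
| |
| /* Give up if the output buffer could not be allocated. */ |
| if (out == NULL) |
| return NULL; |
| |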
| /* Encode the length. */ |
| /* Using 'htonl' is necessary so that the data can be |
| decoded even on machines with different byte order. |
| 'l64a' can return a string shorter than 6 bytes, so |
| we pad it with encoding of 0 ('.') at the end by |
| hand. */ |
| |
| p = stpcpy (cp, l64a (htonl (len))); |
| cp = mempcpy (p, "......", 6 - (p - cp)); |
| |
| while (len > 3) |
| { |
| unsigned long int n = *in++; |
| n = (n << 8) | *in++; |
| n = (n << 8) | *in++; |
| n = (n << 8) | *in++; |
| len -= 4; |
| p = stpcpy (cp, l64a (htonl (n))); |
| cp = mempcpy (p, "......", 6 - (p - cp)); |
| } |
| if (len > 0) |
| { |
| unsigned long int n = *in++; |
| if (--len > 0) |
| { |
| n = (n << 8) | *in++; |
| if (--len > 0) |
| n = (n << 8) | *in; |
| } |
| cp = stpcpy (cp, l64a (htonl (n))); |
| } |
| *cp = '\0'; |
| return out; |
| } |
| |
| It is strange that the library does not provide the complete |
| functionality needed but so be it. |
| |
| To decode data produced with 'l64a' the following function should be |
| used. |
| |
| -- Function: long int a64l (const char *STRING) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The parameter STRING should contain a string which was produced by |
| a call to 'l64a'. The function processes at most 6 characters of |
| this string, and decodes the characters it finds according to the |
| table below. It stops decoding when it finds a character not in |
| the table, rather like 'atoi'; if you have a buffer which has been |
| broken into lines, you must be careful to skip over the end-of-line |
| characters. |
| |
| The decoded number is returned as a 'long int' value. |
| |
| The 'l64a' and 'a64l' functions use a base 64 encoding, in which each |
| character of an encoded string represents six bits of an input word. |
| These symbols are used for the base 64 digits: |
| |
| 0 1 2 3 4 5 6 7 |
| 0 '.' '/' '0' '1' '2' '3' '4' '5' |
| 8 '6' '7' '8' '9' 'A' 'B' 'C' 'D' |
| 16 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' |
| 24 'M' 'N' 'O' 'P' 'Q' 'R' 'S' 'T' |
| 32 'U' 'V' 'W' 'X' 'Y' 'Z' 'a' 'b' |
| 40 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' |
| 48 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' |
| 56 's' 't' 'u' 'v' 'w' 'x' 'y' 'z' |
| |
| This encoding scheme is not standard. There are some other encoding |
| methods which are much more widely used (UU encoding, MIME encoding). |
| Generally, it is better to use one of these encodings. |
| |
| |
| File: libc.info, Node: Argz and Envz Vectors, Prev: Encode Binary Data, Up: String and Array Utilities |
| |
| 5.12 Argz and Envz Vectors |
| ========================== |
| |
| "Argz vectors" are vectors of strings in a contiguous block of memory, |
| each element separated from its neighbors by null-characters (''\0''). |
| |
| "Envz vectors" are an extension of argz vectors where each element is |
| a name-value pair, separated by a ''='' character (as in a Unix |
| environment). |
| |
| * Menu: |
| |
| * Argz Functions:: Operations on argz vectors. |
| * Envz Functions:: Additional operations on environment vectors. |
| |
| |
| File: libc.info, Node: Argz Functions, Next: Envz Functions, Up: Argz and Envz Vectors |
| |
| 5.12.1 Argz Functions |
| --------------------- |
| |
| Each argz vector is represented by a pointer to the first element, of |
| type 'char *', and a size, of type 'size_t', both of which can be |
| initialized to '0' to represent an empty argz vector. All argz |
| functions accept either a pointer and a size argument, or pointers to |
| them, if they will be modified. |
| |
| The argz functions use 'malloc'/'realloc' to allocate/grow argz |
| vectors, and so any argz vector created using these functions may be |
| freed by using 'free'; conversely, any argz function that may grow a |
| string expects that string to have been allocated using 'malloc' (those |
| argz functions that only examine their arguments or modify them in place |
| will work on any sort of memory). *Note Unconstrained Allocation::. |
| |
| All argz functions that do memory allocation have a return type of |
| 'error_t', and return '0' for success, and 'ENOMEM' if an allocation |
| error occurs. |
| |
| These functions are declared in the standard include file 'argz.h'. |
| |
| -- Function: error_t argz_create (char *const ARGV[], char **ARGZ, |
| size_t *ARGZ_LEN) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'argz_create' function converts the Unix-style argument vector |
| ARGV (a vector of pointers to normal C strings, terminated by |
| '(char *)0'; *note Program Arguments::) into an argz vector with |
| the same elements, which is returned in ARGZ and ARGZ_LEN. |
| |
| -- Function: error_t argz_create_sep (const char *STRING, int SEP, char |
| **ARGZ, size_t *ARGZ_LEN) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'argz_create_sep' function converts the null-terminated string |
| STRING into an argz vector (returned in ARGZ and ARGZ_LEN) by |
| splitting it into elements at every occurrence of the character |
| SEP. |
| |
| -- Function: size_t argz_count (const char *ARGZ, size_t ARGZ_LEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| Returns the number of elements in the argz vector ARGZ and |
| ARGZ_LEN. |
| |
| -- Function: void argz_extract (const char *ARGZ, size_t ARGZ_LEN, char |
| **ARGV) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'argz_extract' function converts the argz vector ARGZ and |
| ARGZ_LEN into a Unix-style argument vector stored in ARGV, by |
| putting pointers to every element in ARGZ into successive positions |
| in ARGV, followed by a terminator of '0'. ARGV must be |
| pre-allocated with enough space to hold all the elements in ARGZ |
| plus the terminating '(char *)0' ('(argz_count (ARGZ, ARGZ_LEN) + |
| 1) * sizeof (char *)' bytes should be enough). Note that the |
| string pointers stored into ARGV point into ARGZ--they are not |
| copies--and so ARGZ must be copied if it will be changed while ARGV |
| is still active. This function is useful for passing the elements |
| in ARGZ to an exec function (*note Executing a File::). |
| |
| -- Function: void argz_stringify (char *ARGZ, size_t LEN, int SEP) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'argz_stringify' function converts ARGZ into a normal string with the |
| elements separated by the character SEP, by replacing each ''\0'' |
| inside ARGZ (except the last one, which terminates the string) with |
| SEP. This is handy for printing ARGZ in a readable manner. |
| |
| -- Function: error_t argz_add (char **ARGZ, size_t *ARGZ_LEN, const |
| char *STR) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'argz_add' function adds the string STR to the end of the argz |
| vector '*ARGZ', and updates '*ARGZ' and '*ARGZ_LEN' accordingly. |
| |
| -- Function: error_t argz_add_sep (char **ARGZ, size_t *ARGZ_LEN, const |
| char *STR, int DELIM) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'argz_add_sep' function is similar to 'argz_add', but STR is |
| split into separate elements in the result at occurrences of the |
| character DELIM. This is useful, for instance, for adding the |
| components of a Unix search path to an argz vector, by using a |
| value of '':'' for DELIM. |
| |
| -- Function: error_t argz_append (char **ARGZ, size_t *ARGZ_LEN, const |
| char *BUF, size_t BUF_LEN) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'argz_append' function appends BUF_LEN bytes starting at BUF to |
| the argz vector '*ARGZ', reallocating '*ARGZ' to accommodate it, |
| and adding BUF_LEN to '*ARGZ_LEN'. |
| |
| -- Function: void argz_delete (char **ARGZ, size_t *ARGZ_LEN, char |
| *ENTRY) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| If ENTRY points to the beginning of one of the elements in the argz |
| vector '*ARGZ', the 'argz_delete' function will remove this entry |
| and reallocate '*ARGZ', modifying '*ARGZ' and '*ARGZ_LEN' |
| accordingly. Note that as destructive argz functions usually |
| reallocate their argz argument, pointers into argz vectors such as |
| ENTRY will then become invalid. |
| |
| -- Function: error_t argz_insert (char **ARGZ, size_t *ARGZ_LEN, char |
| *BEFORE, const char *ENTRY) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'argz_insert' function inserts the string ENTRY into the argz |
| vector '*ARGZ' at a point just before the existing element pointed |
| to by BEFORE, reallocating '*ARGZ' and updating '*ARGZ' and |
| '*ARGZ_LEN'. If BEFORE is '0', ENTRY is added to the end instead |
| (as if by 'argz_add'). Since the first element is in fact the same |
| as '*ARGZ', passing in '*ARGZ' as the value of BEFORE will result |
| in ENTRY being inserted at the beginning. |
| |
| -- Function: char * argz_next (const char *ARGZ, size_t ARGZ_LEN, const |
| char *ENTRY) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'argz_next' function provides a convenient way of iterating |
| over the elements in the argz vector ARGZ. It returns a pointer to |
| the next element in ARGZ after the element ENTRY, or '0' if there |
| are no elements following ENTRY. If ENTRY is '0', the first |
| element of ARGZ is returned. |
| |
| This behavior suggests two styles of iteration: |
| |
| char *entry = 0; |
| while ((entry = argz_next (ARGZ, ARGZ_LEN, entry))) |
| ACTION; |
| |
| (the double parentheses are necessary to make some C compilers shut |
| up about what they consider a questionable 'while'-test) and: |
| |
| char *entry; |
| for (entry = ARGZ; |
| entry; |
| entry = argz_next (ARGZ, ARGZ_LEN, entry)) |
| ACTION; |
| |
| Note that the latter depends on ARGZ having a value of '0' if it is |
| empty (rather than a pointer to an empty block of memory); this |
| invariant is maintained for argz vectors created by the functions |
| here. |
| |
| -- Function: error_t argz_replace (char **ARGZ, size_t *ARGZ_LEN, |
| const char *STR, const char *WITH, unsigned *REPLACE_COUNT) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| Replace any occurrences of the string STR in ARGZ with WITH, |
| reallocating ARGZ as necessary. If REPLACE_COUNT is non-zero, |
| '*REPLACE_COUNT' will be incremented by the number of replacements |
| performed. |
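| |
| The functions above combine naturally: 'argz_create_sep' splits a |
| string, 'argz_next' walks the elements, and 'argz_stringify' joins |
| them again. The following is a sketch only; the helper names |
| 'count_components' and 'replace_separator' are hypothetical. |
| |

```c
#include <argz.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the elements of TEXT when it is split at
   the character SEP.  Returns 0 for an empty TEXT or if allocation
   fails.  */
size_t
count_components (const char *text, int sep)
{
  char *argz = NULL;
  size_t argz_len = 0;
  size_t n = 0;
  char *entry = NULL;

  assert (text != NULL);

  if (argz_create_sep (text, sep, &argz, &argz_len) != 0)
    return 0;

  /* First iteration style: start with a null ENTRY.  */
  while ((entry = argz_next (argz, argz_len, entry)) != NULL)
    n++;

  free (argz);
  return n;
}

/* Hypothetical helper: return a newly allocated copy of TEXT in which
   the separator FROM is replaced by TO, or NULL if TEXT is empty or
   allocation fails.  The caller frees the result.  */
char *
replace_separator (const char *text, int from, int to)
{
  char *argz = NULL;
  size_t argz_len = 0;

  assert (text != NULL);

  if (argz_create_sep (text, from, &argz, &argz_len) != 0 || argz == NULL)
    return NULL;

  /* After this call ARGZ is an ordinary null-terminated string.  */
  argz_stringify (argz, argz_len, to);
  return argz;
}
```

| Both helpers release or hand back the vector with 'free' in mind, |
| which is valid because the argz functions allocate with 'malloc'. |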
| |
| |
| File: libc.info, Node: Envz Functions, Prev: Argz Functions, Up: Argz and Envz Vectors |
| |
| 5.12.2 Envz Functions |
| --------------------- |
| |
| Envz vectors are just argz vectors with additional constraints on the |
| form of each element; as such, argz functions can also be used on them, |
| where it makes sense. |
| |
| Each element in an envz vector is a name-value pair, separated by a |
| ''='' character; if multiple ''='' characters are present in an element, |
| those after the first are considered part of the value, and treated like |
| all other non-''\0'' characters. |
| |
| If _no_ ''='' characters are present in an element, that element is |
| considered the name of a "null" entry, as distinct from an entry with an |
| empty value: 'envz_get' will return '0' if given the name of a null entry, |
| whereas an entry with an empty value would result in a value of '""'; |
| 'envz_entry' will still find such entries, however. Null entries can be |
| removed with the 'envz_strip' function. |
| |
| As with argz functions, envz functions that may allocate memory (and |
| thus fail) have a return type of 'error_t', and return either '0' or |
| 'ENOMEM'. |
| |
| These functions are declared in the standard include file 'envz.h'. |
| |
| -- Function: char * envz_entry (const char *ENVZ, size_t ENVZ_LEN, |
| const char *NAME) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'envz_entry' function finds the entry in ENVZ with the name |
| NAME, and returns a pointer to the whole entry--that is, the argz |
| element which begins with NAME followed by a ''='' character. If |
| there is no entry with that name, '0' is returned. |
| |
| -- Function: char * envz_get (const char *ENVZ, size_t ENVZ_LEN, const |
| char *NAME) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'envz_get' function finds the entry in ENVZ with the name NAME |
| (like 'envz_entry'), and returns a pointer to the value portion of |
| that entry (following the ''=''). If there is no entry with that |
| name (or only a null entry), '0' is returned. |
| |
| -- Function: error_t envz_add (char **ENVZ, size_t *ENVZ_LEN, const |
| char *NAME, const char *VALUE) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'envz_add' function adds an entry to '*ENVZ' (updating '*ENVZ' |
| and '*ENVZ_LEN') with the name NAME, and value VALUE. If an entry |
| with the same name already exists in ENVZ, it is removed first. If |
| VALUE is '0', then the new entry will be the special null type of |
| entry (mentioned above). |
| |
| -- Function: error_t envz_merge (char **ENVZ, size_t *ENVZ_LEN, const |
| char *ENVZ2, size_t ENVZ2_LEN, int OVERRIDE) |
| Preliminary: | MT-Safe | AS-Unsafe heap | AC-Unsafe mem | *Note |
| POSIX Safety Concepts::. |
| |
| The 'envz_merge' function adds each entry in ENVZ2 to ENVZ, as if |
| with 'envz_add', updating '*ENVZ' and '*ENVZ_LEN'. If OVERRIDE is |
| true, then values in ENVZ2 will supersede those with the same name |
| in ENVZ, otherwise not. |
| |
| Null entries are treated just like other entries in this respect, |
| so a null entry in ENVZ can prevent an entry of the same name in |
| ENVZ2 from being added to ENVZ, if OVERRIDE is false. |
| |
| -- Function: void envz_strip (char **ENVZ, size_t *ENVZ_LEN) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'envz_strip' function removes any null entries from ENVZ, |
| updating '*ENVZ' and '*ENVZ_LEN'. |
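| |
| The difference between an entry with an empty value and a null entry |
| shows up when looking names up with 'envz_get'. A minimal sketch, |
| assuming a hypothetical helper 'envz_lookup_demo' and made-up entry |
| names: |
| |

```c
#include <assert.h>
#include <envz.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a small envz vector holding a normal
   entry, an entry with an empty value, and a null entry, then look up
   NAME.  Returns 1 and copies the value into OUT (SIZE bytes) if NAME
   has a value, 0 if NAME is absent or a null entry, and -1 if
   allocation fails.  */
int
envz_lookup_demo (const char *name, char *out, size_t size)
{
  char *envz = NULL;
  size_t envz_len = 0;
  const char *value;
  int result = 0;

  assert (name != NULL && out != NULL);

  if (envz_add (&envz, &envz_len, "HOME", "/root") != 0
      || envz_add (&envz, &envz_len, "EMPTY", "") != 0
      || envz_add (&envz, &envz_len, "FLAG", NULL) != 0)
    {
      free (envz);
      return -1;
    }

  value = envz_get (envz, envz_len, name);
  if (value != NULL)
    {
      snprintf (out, size, "%s", value);
      result = 1;
    }

  free (envz);
  return result;
}
```

| Looking up "EMPTY" yields an empty string, while "FLAG" and any |
| missing name both yield no value at all. |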
| |
| |
| File: libc.info, Node: Character Set Handling, Next: Locales, Prev: String and Array Utilities, Up: Top |
| |
| 6 Character Set Handling |
| ************************ |
| |
| Character sets used in the early days of computing had only six, seven, |
| or eight bits for each character: there was never a case where more than |
| eight bits (one byte) were used to represent a single character. The |
| limitations of this approach became more apparent as more people |
| grappled with non-Roman character sets, where not all the characters |
| that make up a language's character set can be represented by 2^8 |
| choices. This chapter shows the functionality that was added to the C |
| library to support multiple character sets. |
| |
| * Menu: |
| |
| * Extended Char Intro:: Introduction to Extended Characters. |
| * Charset Function Overview:: Overview about Character Handling |
| Functions. |
| * Restartable multibyte conversion:: Restartable multibyte conversion |
| Functions. |
| * Non-reentrant Conversion:: Non-reentrant Conversion Function. |
| * Generic Charset Conversion:: Generic Charset Conversion. |
| |
| |
| File: libc.info, Node: Extended Char Intro, Next: Charset Function Overview, Up: Character Set Handling |
| |
| 6.1 Introduction to Extended Characters |
| ======================================= |
| |
| A variety of solutions is available to overcome the differences between |
| character sets with a 1:1 relation between bytes and characters and |
| character sets with ratios of 2:1 or 4:1. The remainder of this section |
| gives a few examples to help understand the design decisions made while |
| developing the functionality of the C library. |
| |
| A distinction we have to make right away is between internal and |
| external representation. "Internal representation" means the |
| representation used by a program while keeping the text in memory. |
| External representations are used when text is stored or transmitted |
| through some communication channel. Examples of external |
| representations include files waiting in a directory to be read and |
| parsed. |
| |
| Traditionally there has been no difference between the two |
| representations. It was equally comfortable and useful to use the same |
| single-byte representation internally and externally. This comfort |
| level decreases with more and larger character sets. |
| |
| One of the problems to overcome with the internal representation is |
| handling text that is externally encoded using different character sets. |
| Assume a program that reads two texts and compares them using some |
| metric. The comparison can be usefully done only if the texts are |
| internally kept in a common format. |
| |
| For such a common format (= character set) eight bits are certainly |
| no longer enough. So the smallest entity will have to grow: "wide |
| characters" will now be used. Instead of one byte per character, two or |
| four will be used. (Three are not good to address in memory and more |
| than four bytes seem not to be necessary.) |
| |
| As shown in some other part of this manual, a completely new family |
| has been created of functions that can handle wide character texts in |
| memory. The most commonly used character sets for such internal wide |
| character representations are Unicode and ISO 10646 (also known as UCS |
| for Universal Character Set). Unicode was originally planned as a |
| 16-bit character set, whereas ISO 10646 was designed to be a 31-bit |
| large code space. The two standards are practically identical. They |
| have the same character repertoire and code table, but Unicode specifies |
| added semantics. At the moment, only characters in the first '0x10000' |
| code positions (the so-called Basic Multilingual Plane, BMP) have been |
| assigned, but the assignment of more specialized characters outside this |
| 16-bit space is already in progress. A number of encodings have been |
| defined for Unicode and ISO 10646 characters: UCS-2 is a 16-bit word |
| that can only represent characters from the BMP, UCS-4 is a 32-bit word |
| that can represent any Unicode and ISO 10646 character, UTF-8 is an |
| ASCII compatible encoding where ASCII characters are represented by |
| ASCII bytes and non-ASCII characters by sequences of 2-6 non-ASCII |
| bytes, and finally UTF-16 is an extension of UCS-2 in which pairs of |
| certain UCS-2 words can be used to encode non-BMP characters up to |
| '0x10ffff'. |
| |
| To represent wide characters the 'char' type is not suitable. For |
| this reason the ISO C standard introduces a new type that is designed to |
| keep one character of a wide character string. To maintain the |
| similarity there is also a type corresponding to 'int' for those |
| functions that take a single wide character. |
| |
| -- Data type: wchar_t |
| This data type is used as the base type for wide character strings. |
| In other words, arrays of objects of this type are the equivalent |
| of 'char[]' for multibyte character strings. The type is defined |
| in 'stddef.h'. |
| |
| The ISO C90 standard, where 'wchar_t' was introduced, does not say |
| anything specific about the representation. It only requires that |
| this type is capable of storing all elements of the basic character |
| set. Therefore it would be legitimate to define 'wchar_t' as |
| 'char', which might make sense for embedded systems. |
| |
| But in the GNU C Library 'wchar_t' is always 32 bits wide and, |
| therefore, capable of representing all UCS-4 values and, therefore, |
| covering all of ISO 10646. Some Unix systems define 'wchar_t' as a |
| 16-bit type and thereby follow Unicode very strictly. This |
| definition is perfectly fine with the standard, but it also means |
| that to represent all characters from Unicode and ISO 10646 one has |
| to use UTF-16 surrogate characters, which is in fact a |
| multi-wide-character encoding. But resorting to |
| multi-wide-character encoding contradicts the purpose of the |
| 'wchar_t' type. |
| |
| -- Data type: wint_t |
| 'wint_t' is a data type used for parameters and variables that |
| contain a single wide character. As the name suggests this type is |
| the equivalent of 'int' when using the normal 'char' strings. The |
| types 'wchar_t' and 'wint_t' often have the same representation if |
| their size is 32 bits wide but if 'wchar_t' is defined as 'char' |
| the type 'wint_t' must be defined as 'int' due to the parameter |
| promotion. |
| |
| This type is defined in 'wchar.h' and was introduced in Amendment 1 |
| to ISO C90. |
| |
| As for the 'char' data type, macros are available for |
| specifying the minimum and maximum value representable in an object of |
| type 'wchar_t'. |
| |
| -- Macro: wint_t WCHAR_MIN |
| The macro 'WCHAR_MIN' evaluates to the minimum value representable |
| by an object of type 'wchar_t'. |
| |
| This macro was introduced in Amendment 1 to ISO C90. |
| |
| -- Macro: wint_t WCHAR_MAX |
| The macro 'WCHAR_MAX' evaluates to the maximum value representable |
| by an object of type 'wchar_t'. |
| |
| This macro was introduced in Amendment 1 to ISO C90. |
| |
| Another special wide character value is the equivalent to 'EOF'. |
| |
| -- Macro: wint_t WEOF |
| The macro 'WEOF' evaluates to a constant expression of type |
| 'wint_t' whose value is different from any member of the extended |
| character set. |
| |
| 'WEOF' need not be the same value as 'EOF' and unlike 'EOF' it also |
| need _not_ be negative. In other words, sloppy code like |
| |
| { |
| int c; |
| ... |
| while ((c = getc (fp)) >= 0) |
| ... |
| } |
| |
| has to be rewritten to use 'WEOF' explicitly when wide characters |
| are used: |
| |
| { |
| wint_t c; |
| ... |
| while ((c = getwc (fp)) != WEOF) |
| ... |
| } |
| |
| This macro was introduced in Amendment 1 to ISO C90 and is defined |
| in 'wchar.h'. |
| |
| These internal representations present problems when it comes to |
| storage and transmittal. Because each single wide character consists of |
| more than one byte, they are affected by byte-ordering. Thus, machines |
| with different endiannesses would see different values when accessing the |
| same data. This byte ordering concern also applies to communication |
| protocols that are all byte-based and therefore require the sender |
| to decide how to split the wide character into bytes. A last (but |
| not least important) point is that wide characters often require more |
| storage space than a customized byte-oriented character set. |
| |
| For all the above reasons, an external encoding that is different |
| from the internal encoding is often used if the latter is UCS-2 or |
| UCS-4. The external encoding is byte-based and can be chosen |
| appropriately for the environment and for the texts to be handled. A |
| variety of different character sets can be used for this external |
| encoding (information that will not be exhaustively presented |
| here; instead, a description of the major groups will suffice). All of |
| the ASCII-based character sets fulfill one requirement: they are |
| "filesystem safe." This means that the character ''/'' is used in the |
| encoding _only_ to represent itself. Things are a bit different for |
| character sets like EBCDIC (Extended Binary Coded Decimal Interchange |
| Code, a character set family used by IBM), but if the operating system |
| does not understand EBCDIC directly the parameters to system calls have |
| to be converted first anyhow. |
| |
| * The simplest character sets are single-byte character sets. There |
| can be only up to 256 characters (for 8 bit character sets), which |
| is not sufficient to cover all languages but might be sufficient to |
| handle a specific text. Handling of 8 bit character sets is |
| simple. This is not true for other kinds presented later, and |
| therefore, the application one uses might require the use of 8 bit |
| character sets. |
| |
| * The ISO 2022 standard defines a mechanism for extended character |
| sets where one character _can_ be represented by more than one |
| byte. This is achieved by associating a state with the text. |
| Characters that can be used to change the state can be embedded in |
| the text. Each byte in the text might have a different |
| interpretation in each state. The state might even influence |
| whether a given byte stands for a character on its own or whether |
| it has to be combined with some more bytes. |
| |
| In most uses of ISO 2022 the defined character sets do not allow |
| state changes that cover more than the next character. This has |
| the big advantage that whenever one can identify the beginning of |
| the byte sequence of a character one can interpret a text |
| correctly. Examples of character sets using this policy are the |
| various EUC character sets used by Sun's operating systems (EUC-JP, |
| EUC-KR, EUC-TW, and EUC-CN) and Shift_JIS (SJIS, a Japanese |
| encoding). |
| |
| But there are also character sets using a state that is valid for |
| more than one character and has to be changed by another byte |
| sequence. Examples for this are ISO-2022-JP, ISO-2022-KR, and |
| ISO-2022-CN. |
| |
| * Early attempts to fix 8 bit character sets for other languages |
| using the Roman alphabet led to character sets like ISO 6937. |
| Here bytes representing characters like the acute accent do not |
| produce output themselves: one has to combine them with other |
| characters to get the desired result. For example, one writes the |
| byte sequence '0xc2 0x61' (non-spacing acute accent, followed by |
| lower-case 'a') to get the "small a with acute" character. To get |
| the acute accent character on its own, one has to write '0xc2 0x20' |
| (the non-spacing acute followed by a space). |
| |
| Character sets like ISO 6937 are used in some embedded systems such |
| as teletex. |
| |
| * Instead of converting the Unicode or ISO 10646 text used |
| internally, it is often also sufficient to simply use an encoding |
| different from UCS-2/UCS-4. The Unicode and ISO 10646 standards |
| even specify such an encoding: UTF-8. This encoding is able to |
| represent all 31 bits of ISO 10646 in a byte string of length one |
| to six. |
| |
| There were a few other attempts to encode ISO 10646 such as UTF-7, |
| but UTF-8 is today the only encoding that should be used. In fact, |
| with any luck UTF-8 will soon be the only external encoding that |
| has to be supported. It proves to be universally usable and its |
| only disadvantage is that it favors languages written in the Roman |
| alphabet: the byte string representation of other scripts |
| (Cyrillic, Greek, Asian scripts) is longer than it would be in a |
| character set designed specifically for these scripts. Methods |
| like the Unicode compression scheme can alleviate these problems. |
| |
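| The length rule for UTF-8 mentioned above can be made concrete with a |
| small sketch. The helper name 'utf8_length' is hypothetical and not |
| part of the GNU C Library; it assumes the original 31-bit definition |
| of UTF-8 described here (later revisions of the standards restrict |
| UTF-8 to code points up to U+10FFFF and therefore to four bytes). |
| |
```c
#include <stddef.h>

/* Number of bytes the original (31-bit) UTF-8 scheme needs for the
   ISO 10646 code point CP, or 0 if CP does not fit into 31 bits.
   Hypothetical helper, shown for illustration only.  */
size_t
utf8_length (unsigned long cp)
{
  if (cp < 0x80UL)
    return 1;                   /* ASCII stays a single byte.  */
  if (cp < 0x800UL)
    return 2;
  if (cp < 0x10000UL)
    return 3;
  if (cp < 0x200000UL)
    return 4;
  if (cp < 0x4000000UL)
    return 5;
  if (cp <= 0x7fffffffUL)
    return 6;
  return 0;                     /* Outside the 31-bit range.  */
}
```
| |
| Under this rule ASCII characters occupy one byte, whereas Greek and |
| Cyrillic letters occupy two bytes each, which is the size overhead |
| relative to a dedicated 8 bit character set discussed above. |
| |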
| The question remaining is how to select the character set or |
| encoding to use. The answer is that you cannot decide this yourself; it |
| is decided by the developers of the system or the majority of the users. |
| Since the goal is interoperability one has to use whatever the other |
| people one works with use. If there are no constraints, the selection |
| is based on the requirements the expected circle of users will have. In |
| other words, if a project is expected to be used in only, say, Russia it |
| is fine to use KOI8-R or a similar character set. But if at the same |
| time people from, say, Greece are participating one should use a |
| character set that allows all people to collaborate. |
| |
| The most widely useful solution seems to be: go with the most general |
| character set, namely ISO 10646. Use UTF-8 as the external encoding and |
| problems about users not being able to use their own language adequately |
| are a thing of the past. |
| |
| One final comment about the choice of the wide character |
| representation is necessary at this point. We have said above that the |
| natural choice is using Unicode or ISO 10646. This is not required, but |
| at least encouraged, by the ISO C standard. The standard at least |
| defines a macro '__STDC_ISO_10646__' that is defined only on systems |
| where the 'wchar_t' type encodes ISO 10646 characters. If this symbol |
| is not defined, one should avoid making assumptions about the wide |
| character representation. If the programmer uses only the functions |
| provided by the C library to handle wide character strings, there |
| should be no compatibility problems with other systems. |
| |
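| A program that wants to rely on the wide character representation |
| can test the macro at compile time. The following is only a sketch; |
| the function name 'wchar_is_iso10646' is made up for this example. |
| |
```c
#include <wchar.h>

/* Return 1 if the implementation promises that wchar_t values are
   ISO 10646 code points, 0 if no such promise is made.
   Hypothetical helper, shown for illustration only.  */
int
wchar_is_iso10646 (void)
{
#ifdef __STDC_ISO_10646__
  return 1;
#else
  /* Make no assumptions about the wide character representation.  */
  return 0;
#endif
}
```
| |
| Code that receives 0 here should restrict itself to the conversion |
| and classification functions of the C library. |
| |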
| |
| File: libc.info, Node: Charset Function Overview, Next: Restartable multibyte conversion, Prev: Extended Char Intro, Up: Character Set Handling |
| |
| 6.2 Overview about Character Handling Functions |
| =============================================== |
| |
| A Unix C library contains three different sets of functions in two |
| families to handle character set conversion. One of the function |
| families (the most commonly used) is specified in the ISO C90 standard |
| and, therefore, is portable even beyond the Unix world. Unfortunately |
| this family is the least useful one. These functions should be avoided |
| whenever possible, especially when developing libraries (as opposed to |
| applications). |
| |
| The second family of functions was introduced in the early Unix |
| standards (XPG2) and is still part of the latest and greatest Unix |
| standard: Unix 98. It is also the most powerful and useful set of |
| functions. But we will start with the functions defined in Amendment 1 |
| to ISO C90. |
| |
| |
| File: libc.info, Node: Restartable multibyte conversion, Next: Non-reentrant Conversion, Prev: Charset Function Overview, Up: Character Set Handling |
| |
| 6.3 Restartable Multibyte Conversion Functions |
| ============================================== |
| |
| The ISO C standard defines functions to convert strings from a multibyte |
| representation to wide character strings. There are a number of |
| peculiarities: |
| |
| * The character set assumed for the multibyte encoding is not |
| specified as an argument to the functions. Instead the character |
| set specified by the 'LC_CTYPE' category of the current locale is |
| used; see *note Locale Categories::. |
| |
| * The functions handling more than one character at a time require |
| NUL terminated strings as the argument (i.e., converting blocks of |
| text does not work unless one can add a NUL byte at an appropriate |
| place). The GNU C Library contains some extensions to the standard |
| that allow specifying a size, but basically they also expect |
| terminated strings. |
| |
| Despite these limitations the ISO C functions can be used in many |
| contexts. In graphical user interfaces, for instance, it is not |
| uncommon to have functions that require text to be displayed in a wide |
| character string if the text is not simple ASCII. The text itself might |
| come from a file with translations and the user should decide about the |
| current locale, which determines the translation and therefore also the |
| external encoding used. In such a situation (and many others) the |
| functions described here are perfect. If more freedom while performing |
| the conversion is necessary take a look at the 'iconv' functions (*note |
| Generic Charset Conversion::). |
| |
| * Menu: |
| |
| * Selecting the Conversion:: Selecting the conversion and its properties. |
| * Keeping the state:: Representing the state of the conversion. |
| * Converting a Character:: Converting Single Characters. |
| * Converting Strings:: Converting Multibyte and Wide Character |
| Strings. |
| * Multibyte Conversion Example:: A Complete Multibyte Conversion Example. |
| |
| |
| File: libc.info, Node: Selecting the Conversion, Next: Keeping the state, Up: Restartable multibyte conversion |
| |
| 6.3.1 Selecting the conversion and its properties |
| ------------------------------------------------- |
| |
| We already said above that the currently selected locale for the |
| 'LC_CTYPE' category decides about the conversion that is performed by |
| the functions we are about to describe. Each locale uses its own |
| character set (given as an argument to 'localedef') and this is the one |
| assumed as the external multibyte encoding. The wide character set is |
| always UCS-4 in the GNU C Library. |
| |
| A characteristic of each multibyte character set is the maximum |
| number of bytes that can be necessary to represent one character. This |
| information is quite important when writing code that uses the |
| conversion functions (as shown in the examples below). The ISO C |
| standard defines two macros that provide this information. |
| |
| -- Macro: int MB_LEN_MAX |
| 'MB_LEN_MAX' specifies the maximum number of bytes in the multibyte |
| sequence for a single character in any of the supported locales. |
| It is a compile-time constant and is defined in 'limits.h'. |
| |
| -- Macro: int MB_CUR_MAX |
| 'MB_CUR_MAX' expands into a positive integer expression that is the |
| maximum number of bytes in a multibyte character in the current |
| locale. The value is never greater than 'MB_LEN_MAX'. Unlike |
| 'MB_LEN_MAX' this macro need not be a compile-time constant, and in |
| the GNU C Library it is not. |
| |
| 'MB_CUR_MAX' is defined in 'stdlib.h'. |
| |
| Two different macros are necessary since strictly ISO C90 compilers |
| do not allow variable length array definitions, but still it is |
| desirable to avoid dynamic allocation. This incomplete piece of code |
| shows the problem: |
| |
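|      { |
|        char buf[MB_LEN_MAX]; |
|        size_t len = 0;   /* Number of bytes currently in BUF.  */ |
|        size_t used;      /* Bytes consumed by one conversion.  */ |
| |
|        for (;;) |
|          { |
|            len += fread (&buf[len], 1, MB_CUR_MAX - len, fp); |
|            if (len == 0) |
|              break; |
|            /* ... process buf, setting USED (1 <= USED <= LEN) */ |
|            memmove (buf, &buf[used], len - used); |
|            len -= used; |
|          } |
|      } |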
| |
| The code in the inner loop is expected to have always enough bytes in |
| the array BUF to convert one multibyte character. The array BUF has to |
| be sized statically since many compilers do not allow a variable size. |
| The 'fread' call makes sure that 'MB_CUR_MAX' bytes are always available |
| in BUF. Note that it isn't a problem if 'MB_CUR_MAX' is not a |
| compile-time constant. |
| |
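| A complete variant of this buffering scheme might look like the |
| following sketch, which counts the multibyte characters in a stream. |
| The function name 'count_stream_chars' is hypothetical. The sketch |
| assumes that a NUL character in the input is encoded as a single |
| byte without a preceding shift sequence. |
| |
```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Count the multibyte characters readable from FP, using a buffer of
   fixed size MB_LEN_MAX.  Returns (size_t) -1 if the input is invalid
   or ends inside a character.  Hypothetical helper.  */
size_t
count_stream_chars (FILE *fp)
{
  char buf[MB_LEN_MAX];
  size_t len = 0;               /* Bytes currently held in BUF.  */
  size_t count = 0;
  int pending = 0;              /* Nonzero inside a partial character.  */
  mbstate_t state;

  memset (&state, '\0', sizeof (state));
  for (;;)
    {
      size_t used;

      /* Top the buffer up to MB_CUR_MAX bytes if input allows.  */
      if (len < MB_CUR_MAX)
        len += fread (&buf[len], 1, MB_CUR_MAX - len, fp);
      if (len == 0)
        return pending ? (size_t) -1 : count;

      used = mbrtowc (NULL, buf, len, &state);
      if (used == (size_t) -1)
        return (size_t) -1;     /* Invalid sequence.  */
      if (used == (size_t) -2)
        {
          /* All LEN bytes were absorbed into STATE; fetch more.  */
          len = 0;
          pending = 1;
          continue;
        }
      if (used == 0)
        used = 1;               /* A NUL byte (see assumption above).  */

      /* Keep the bytes that belong to the following characters.  */
      memmove (buf, &buf[used], len - used);
      len -= used;
      pending = 0;
      ++count;
    }
}
```
| |
| The array is sized with the compile-time constant 'MB_LEN_MAX', |
| while the run-time value 'MB_CUR_MAX' only limits how many bytes are |
| requested from the stream. |
| |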
| |
| File: libc.info, Node: Keeping the state, Next: Converting a Character, Prev: Selecting the Conversion, Up: Restartable multibyte conversion |
| |
| 6.3.2 Representing the state of the conversion |
| ---------------------------------------------- |
| |
| In the introduction of this chapter it was said that certain character |
| sets use a "stateful" encoding. That is, the encoded values depend in |
| some way on the previous bytes in the text. |
| |
| Since the conversion functions allow converting a text in more than |
| one step we must have a way to pass this information from one call of |
| the functions to another. |
| |
| -- Data type: mbstate_t |
| A variable of type 'mbstate_t' can contain all the information |
| about the "shift state" needed from one call to a conversion |
| function to another. |
| |
| 'mbstate_t' is defined in 'wchar.h'. It was introduced in Amendment 1 |
| to ISO C90. |
| |
| To use objects of type 'mbstate_t' the programmer has to define such |
| objects (normally as local variables on the stack) and pass a pointer to |
| the object to the conversion functions. This way the conversion |
| function can update the object if the current multibyte character set is |
| stateful. |
| |
| There is no specific function or initializer to put the state object |
| in any specific state. The rules are that the object should always |
| represent the initial state before the first use, and this is achieved |
| by clearing the whole variable with code such as follows: |
| |
|      { |
|        mbstate_t state; |
|        memset (&state, '\0', sizeof (state)); |
|        /* From now on STATE can be used.  */ |
|        ... |
|      } |
| |
| When using the conversion functions to generate output it is often |
| necessary to test whether the current state corresponds to the initial |
| state. This is necessary, for example, to decide whether to emit escape |
| sequences to set the state to the initial state at certain sequence |
| points. Communication protocols often require this. |
| |
| -- Function: int mbsinit (const mbstate_t *PS) |
| Preliminary: | MT-Safe | AS-Safe | AC-Safe | *Note POSIX Safety |
| Concepts::. |
| |
| The 'mbsinit' function determines whether the state object pointed |
| to by PS is in the initial state. If PS is a null pointer or the |
| object is in the initial state the return value is nonzero. |
| Otherwise it is zero. |
| |
| 'mbsinit' was introduced in Amendment 1 to ISO C90 and is declared |
| in 'wchar.h'. |
| |
| Code using 'mbsinit' often looks similar to this: |
| |
|      { |
|        mbstate_t state; |
|        memset (&state, '\0', sizeof (state)); |
|        /* Use STATE.  */ |
|        ... |
|        if (! mbsinit (&state)) |
|          { |
|            /* Emit code to return to initial state.  */ |
|            const wchar_t empty[] = L""; |
|            const wchar_t *srcp = empty; |
|            wcsrtombs (outbuf, &srcp, outbuflen, &state); |
|          } |
|        ... |
|      } |
| |
| The code to emit the escape sequence to get back to the initial state |
| is interesting. The 'wcsrtombs' function can be used to determine the |
| necessary output code (*note Converting Strings::). Please note that |
| with the GNU C Library it is not necessary to perform this extra action |
| for the conversion from multibyte text to wide character text since the |
| wide character encoding is not stateful. But there is nothing mentioned |
| in any standard that prohibits making 'wchar_t' using a stateful |
| encoding. |
| |
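| As a further illustration, 'mbsinit' can be used to check whether a |
| block of multibyte text ends in the initial shift state. This is |
| only a sketch; the name 'ends_in_initial_state' is hypothetical, and |
| the treatment of a truncated trailing character relies on the state |
| object recording the partial sequence. |
| |
```c
#include <string.h>
#include <wchar.h>

/* Examine the N bytes starting at S (no terminating NUL required).
   Return 1 if the text leaves the conversion in the initial shift
   state, 0 if it does not, and -1 if the text is invalid.
   Hypothetical helper, shown for illustration only.  */
int
ends_in_initial_state (const char *s, size_t n)
{
  mbstate_t state;

  memset (&state, '\0', sizeof (state));
  while (n > 0)
    {
      size_t nbytes = mbrlen (s, n, &state);
      if (nbytes == (size_t) -1)
        return -1;              /* Invalid multibyte sequence.  */
      if (nbytes == (size_t) -2)
        break;                  /* Text ends inside a character.  */
      if (nbytes == 0)
        nbytes = 1;             /* Embedded NUL byte.  */
      s += nbytes;
      n -= nbytes;
    }
  return mbsinit (&state) != 0;
}
```
| |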
| |
| File: libc.info, Node: Converting a Character, Next: Converting Strings, Prev: Keeping the state, Up: Restartable multibyte conversion |
| |
| 6.3.3 Converting Single Characters |
| ---------------------------------- |
| |
| The most fundamental of the conversion functions are those dealing with |
| single characters. Please note that this does not always mean single |
| bytes. But since there is very often a subset of the multibyte |
| character set that consists of single byte sequences, there are |
| functions to help with converting bytes. Frequently, ASCII is a subpart |
| of the multibyte character set. In such a scenario, each ASCII |
| character stands for itself, and all other characters have at least a |
| first byte that is beyond the range 0 to 127. |
| |
| -- Function: wint_t btowc (int C) |
| Preliminary: | MT-Safe | AS-Unsafe corrupt heap lock dlopen | |
| AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::. |
| |
| The 'btowc' function ("byte to wide character") converts a valid |
| single byte character C in the initial shift state into the wide |
| character equivalent using the conversion rules from the currently |
| selected locale of the 'LC_CTYPE' category. |
| |
| If '(unsigned char) C' is not a valid single-byte multibyte character |
| or if C is 'EOF', the function returns 'WEOF'. |
| |
| Please note the restriction of C being tested for validity only in |
| the initial shift state. No 'mbstate_t' object is used from which |
| the state information is taken, and the function also does not use |
| any static state. |
| |
| The 'btowc' function was introduced in Amendment 1 to ISO C90 and |
| is declared in 'wchar.h'. |
| |
| Despite the limitation that the single byte value is always |
| interpreted in the initial state, this function is actually useful most |
| of the time. Most character sets are either entirely single-byte |
| character sets or they are extensions of ASCII. It is then possible to |
| write code like this (not that this specific example is very useful): |
| |
|      wchar_t * |
|      itow (unsigned long int val) |
|      { |
|        static wchar_t buf[30]; |
|        wchar_t *wcp = &buf[29]; |
|        *wcp = L'\0'; |
|        while (val != 0) |
|          { |
|            *--wcp = btowc ('0' + val % 10); |
|            val /= 10; |
|          } |
|        if (wcp == &buf[29]) |
|          *--wcp = L'0'; |
|        return wcp; |
|      } |
| |
| Why is it necessary to use such a complicated implementation and not |
| simply cast ''0' + val % 10' to a wide character? The answer is that |
| there is no guarantee that one can perform this kind of arithmetic on |
| the characters of the character set used for the 'wchar_t' representation. |
| In other situations the bytes are not constant at compile time and so |
| the compiler cannot do the work. In situations like this, using 'btowc' |
| is required. |
| |
| There is also a function for the conversion in the other direction. |
| |
| -- Function: int wctob (wint_t C) |
| Preliminary: | MT-Safe | AS-Unsafe corrupt heap lock dlopen | |
| AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::. |
| |
| The 'wctob' function ("wide character to byte") takes as the |
| parameter a valid wide character. If the multibyte representation |
| for this character in the initial state is exactly one byte long, |
| the return value of this function is this character. Otherwise the |
| return value is 'EOF'. |
| |
| 'wctob' was introduced in Amendment 1 to ISO C90 and is declared in |
| 'wchar.h'. |
| |
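| The two functions can be combined to test whether a byte is a |
| complete character on its own in the current locale. The following |
| is a minimal sketch; the name 'byte_round_trips' is hypothetical. |
| |
```c
#include <stdio.h>
#include <wchar.h>

/* Return 1 if the byte C is, in the initial shift state, a complete
   character that converts to a wide character and back to the same
   byte; return 0 otherwise.  Hypothetical helper.  */
int
byte_round_trips (int c)
{
  wint_t wc;

  if (c == EOF)
    return 0;
  wc = btowc (c);
  if (wc == WEOF)
    return 0;                   /* Not a valid single byte character.  */
  return wctob (wc) == (unsigned char) c;
}
```
| |
| For the members of the basic ASCII character set this test is |
| expected to succeed in the locales where ASCII is a subset of the |
| multibyte character set. |
| |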
| There are more general functions to convert single characters from |
| multibyte representation to wide characters and vice versa. These |
| functions pose no limit on the length of the multibyte representation |
| and they also do not require it to be in the initial state. |
| |
| -- Function: size_t mbrtowc (wchar_t *restrict PWC, const char |
| *restrict S, size_t N, mbstate_t *restrict PS) |
| Preliminary: | MT-Unsafe race:mbrtowc/!ps | AS-Unsafe corrupt heap |
| lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety |
| Concepts::. |
| |
| The 'mbrtowc' function ("multibyte restartable to wide character") |
| converts the next multibyte character in the string pointed to by S |
| into a wide character and stores it in the wide character string |
| pointed to by PWC. The conversion is performed according to the |
| locale currently selected for the 'LC_CTYPE' category. If the |
| conversion for the character set used in the locale requires a |
| state, the multibyte string is interpreted in the state represented |
| by the object pointed to by PS. If PS is a null pointer, a static, |
| internal state variable used only by the 'mbrtowc' function is |
| used. |
| |
| If the next multibyte character corresponds to the NUL wide |
| character, the return value of the function is 0 and the state |
| object is afterwards in the initial state. If the next N or fewer |
| bytes form a correct multibyte character, the return value is the |
| number of bytes starting from S that form the multibyte character. |
| The conversion state is updated according to the bytes consumed in |
| the conversion. In both cases the wide character (either the |
| 'L'\0'' or the one found in the conversion) is stored in the string |
| pointed to by PWC if PWC is not null. |
| |
| If the first N bytes of the multibyte string possibly form a valid |
| multibyte character but there are more than N bytes needed to |
| complete it, the return value of the function is '(size_t) -2' and |
| no value is stored. Please note that this can happen even if N has |
| a value greater than or equal to 'MB_CUR_MAX' since the input might |
| contain redundant shift sequences. |
| |
| If the first N bytes of the multibyte string cannot possibly form |
| a valid multibyte character, no value is stored, the global |
| variable 'errno' is set to the value 'EILSEQ', and the function |
| returns '(size_t) -1'. The conversion state is afterwards |
| undefined. |
| |
| 'mbrtowc' was introduced in Amendment 1 to ISO C90 and is declared |
| in 'wchar.h'. |
| |
| Use of 'mbrtowc' is straightforward. A function that copies a |
| multibyte string into a wide character string while at the same time |
| converting all lowercase characters into uppercase could look like this |
| (this is not the final version, just an example; it has no thorough |
| error checking, and sometimes leaks memory): |
| |
|      wchar_t * |
|      mbstouwcs (const char *s) |
|      { |
|        /* Include the terminating NUL byte in the conversion.  */ |
|        size_t len = strlen (s) + 1; |
|        wchar_t *result = malloc (len * sizeof (wchar_t)); |
|        wchar_t *wcp = result; |
|        wchar_t tmp[1]; |
|        mbstate_t state; |
|        size_t nbytes; |
| |
|        memset (&state, '\0', sizeof (state)); |
|        while ((nbytes = mbrtowc (tmp, s, len, &state)) > 0) |
|          { |
|            if (nbytes >= (size_t) -2) |
|              /* Invalid input string.  */ |
|              return NULL; |
|            *wcp++ = towupper (tmp[0]); |
|            len -= nbytes; |
|            s += nbytes; |
|          } |
|        /* Terminate the result string.  */ |
|        *wcp = L'\0'; |
|        return result; |
|      } |
| |
| The use of 'mbrtowc' should be clear. A single wide character is |
| stored in 'TMP[0]', and the number of consumed bytes is stored in the |
| variable NBYTES. If the conversion is successful, the uppercase variant |
| of the wide character is stored in the RESULT array and the pointer to |
| the input string and the number of available bytes are adjusted. |
| |
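| A more defensive variant of this function might look like the |
| following sketch. The name 'mbstouwcs_checked' is hypothetical. The |
| sketch passes the terminating NUL byte to 'mbrtowc', checks the |
| result of 'malloc', terminates the result, and frees the memory when |
| the input turns out to be invalid or truncated. |
| |
```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Convert the multibyte string S into a newly allocated, uppercased
   wide character string.  Returns NULL on error; the caller frees
   the result.  Hypothetical helper, shown for illustration only.  */
wchar_t *
mbstouwcs_checked (const char *s)
{
  /* Include the terminating NUL byte in the conversion.  */
  size_t len = strlen (s) + 1;
  wchar_t *result = malloc (len * sizeof (wchar_t));
  wchar_t *wcp = result;
  mbstate_t state;

  if (result == NULL)
    return NULL;
  memset (&state, '\0', sizeof (state));
  for (;;)
    {
      wchar_t wc;
      size_t nbytes = mbrtowc (&wc, s, len, &state);

      if (nbytes == 0)
        {
          /* The NUL wide character was reached.  */
          *wcp = L'\0';
          return result;
        }
      if (nbytes >= (size_t) -2)
        {
          /* Invalid or truncated input string.  */
          free (result);
          errno = EILSEQ;
          return NULL;
        }
      *wcp++ = towupper (wc);
      len -= nbytes;
      s += nbytes;
    }
}
```
| |
| The allocation strategy is unchanged: one wide character is reserved |
| for each byte of the input, including the terminating NUL byte. |
| |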
| The only non-obvious thing about 'mbstouwcs' might be the way memory is |
| allocated for the result. The above code uses the fact that there can |
| never be more wide characters in the converted results than there are |
| bytes in the multibyte input string. This method yields a pessimistic |
| guess about the size of the result, and if many wide character strings |
| have to be constructed this way or if the strings are long, the extra |
| memory required to be allocated because the input string contains |
| multibyte characters might be significant. The allocated memory block |
| can be resized to the correct size before returning it, but a better |
| solution might be to allocate just the right amount of space for the |
| result right away. Unfortunately there is no function to compute the |
| length of the wide character string directly from the multibyte string. |
| There is, however, a function that does part of the work. |
| |
| -- Function: size_t mbrlen (const char *restrict S, size_t N, mbstate_t |
| *PS) |
| Preliminary: | MT-Unsafe race:mbrlen/!ps | AS-Unsafe corrupt heap |
| lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety |
| Concepts::. |
| |
| The 'mbrlen' function ("multibyte restartable length") computes the |
| number of bytes, at most N, starting at S that form the next valid |
| and complete multibyte character. |
| |
| If the next multibyte character corresponds to the NUL wide |
| character, the return value is 0. If the next N bytes form a valid |
| multibyte character, the number of bytes belonging to this |
| multibyte character byte sequence is returned. |
| |
| If the first N bytes possibly form a valid multibyte character but |
| the character is incomplete, the return value is '(size_t) -2'. |
| Otherwise the multibyte character sequence is invalid and the |
| return value is '(size_t) -1'. |
| |
| The multibyte sequence is interpreted in the state represented by |
| the object pointed to by PS. If PS is a null pointer, a state |
| object local to 'mbrlen' is used. |
| |
| 'mbrlen' was introduced in Amendment 1 to ISO C90 and is declared |
| in 'wchar.h'. |
| |
| The attentive reader now will note that 'mbrlen' can be implemented |
| as |
| |
|      mbrtowc (NULL, s, n, ps != NULL ? ps : &internal) |
| |
| This is true and in fact is mentioned in the official specification. |
| How can this function be used to determine the length of the wide |
| character string created from a multibyte character string? It is not |
| directly usable, but we can define a function 'mbslen' using it: |
| |
|      size_t |
|      mbslen (const char *s) |
|      { |
|        mbstate_t state; |
|        size_t result = 0; |
|        size_t nbytes; |
|        memset (&state, '\0', sizeof (state)); |
|        while ((nbytes = mbrlen (s, MB_LEN_MAX, &state)) > 0) |
|          { |
|            if (nbytes >= (size_t) -2) |
|              /* Something is wrong.  */ |
|              return (size_t) -1; |
|            s += nbytes; |
|            ++result; |
|          } |
|        return result; |
|      } |
| |
| This function simply calls 'mbrlen' for each multibyte character in |
| the string and counts the number of function calls. Please note that we |
| here use 'MB_LEN_MAX' as the size argument in the 'mbrlen' call. This |
| is acceptable since a) this value is at least as large as the length of |
| the longest multibyte character sequence and b) we know that the string |
| S ends with a NUL byte, which cannot be part of any other multibyte |
| character sequence but the one representing the NUL wide character. |
| Therefore, the 'mbrlen' function will never read invalid memory. |
| |
| Now that this function is available (just to make this clear, this |
| function is _not_ part of the GNU C Library) we can compute the number |
| of bytes required to store the wide character string converted from |
| the multibyte character string S using |
| |
|      wcs_bytes = (mbslen (s) + 1) * sizeof (wchar_t); |
| |
| Please note that the 'mbslen' function is quite inefficient. The |
| implementation of 'mbstouwcs' with 'mbslen' would have to perform the |
| conversion of the multibyte character input string twice, and this |
| conversion might be quite expensive. So it is necessary to think about |
| the consequences of using the easier but imprecise method before doing |
| the work twice. |
| |
| -- Function: size_t wcrtomb (char *restrict S, wchar_t WC, mbstate_t |
| *restrict PS) |
| Preliminary: | MT-Unsafe race:wcrtomb/!ps | AS-Unsafe corrupt heap |
| lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety |
| Concepts::. |
| |
| The 'wcrtomb' function ("wide character restartable to multibyte") |
| converts a single wide character into a multibyte string |
| corresponding to that wide character. |
| |
| If S is a null pointer, the function resets the state stored in the |
| object pointed to by PS (or the internal 'mbstate_t' object) to |
| the initial state. This can also be achieved by a call like this: |
| |
|      wcrtomb (temp_buf, L'\0', ps) |
| |
| since, if S is a null pointer, 'wcrtomb' performs as if it writes |
| into an internal buffer, which is guaranteed to be large enough. |
| |
| If WC is the NUL wide character, 'wcrtomb' emits, if necessary, a |
| shift sequence to get the state PS into the initial state followed |
| by a single NUL byte, which is stored in the string S. |
| |
| Otherwise a byte sequence (possibly including shift sequences) is |
| written into the string S. This only happens if WC is a valid wide |
| character (i.e., it has a multibyte representation in the character |
| set selected by the locale of the 'LC_CTYPE' category). If WC is not a |
| valid wide character, nothing is stored in the string S, 'errno' |
| is set to 'EILSEQ', the conversion state in PS is undefined and the |
| return value is '(size_t) -1'. |
| |
| If no error occurred the function returns the number of bytes |
| stored in the string S. This includes all bytes representing shift |
| sequences. |
| |
| One word about the interface of the function: there is no parameter |
| specifying the length of the array S. Instead the function assumes |
| that there are at least 'MB_CUR_MAX' bytes available since this is |
| the maximum length of any byte sequence representing a single |
| character. So the caller has to make sure that there is enough |
| space available, otherwise buffer overruns can occur. |
| |
| 'wcrtomb' was introduced in Amendment 1 to ISO C90 and is declared |
| in 'wchar.h'. |
| |
| Using 'wcrtomb' is as easy as using 'mbrtowc'. The following example |
| appends a wide character string to a multibyte character string. Again, |
| the code is not really useful (or correct), it is simply here to |
| demonstrate the use and some problems. |
| |
|      char * |
|      mbscatwcs (char *s, size_t len, const wchar_t *ws) |
|      { |
|        mbstate_t state; |
|        /* Find the end of the existing string.  */ |
|        char *wp = strchr (s, '\0'); |
|        len -= wp - s; |
|        memset (&state, '\0', sizeof (state)); |
|        do |
|          { |
|            size_t nbytes; |
|            if (len < MB_CUR_MAX) |
|              { |
|                /* We cannot guarantee that the next |
|                   character fits into the buffer, so |
|                   return an error.  */ |
|                errno = E2BIG; |
|                return NULL; |
|              } |
|            nbytes = wcrtomb (wp, *ws, &state); |
|            if (nbytes == (size_t) -1) |
|              /* Error in the conversion.  */ |
|              return NULL; |
|            len -= nbytes; |
|            wp += nbytes; |
|          } |
|        while (*ws++ != L'\0'); |
|        return s; |
|      } |
| |
| First the function has to find the end of the string currently in the |
| array S. The 'strchr' call does this very efficiently since a |
| requirement for multibyte character representations is that the NUL byte |
| is never used except to represent itself (and in this context, the end |
| of the string). |
| |
| After initializing the state object the loop is entered where the |
| first task is to make sure there is enough room in the array S. We |
| abort if there are not at least 'MB_CUR_MAX' bytes available. This is |
| not always optimal but we have no other choice. We might have fewer than |
| 'MB_CUR_MAX' bytes available but the next multibyte character might also |
| be only one byte long. At the time the 'wcrtomb' call returns it is too |
| late to decide whether the buffer was large enough. If this solution is |
| unsuitable, there is a very slow but more accurate solution. |
| |
|        ... |
|        if (len < MB_CUR_MAX) |
|          { |
|            mbstate_t temp_state; |
|            char temp_buf[MB_LEN_MAX]; |
|            size_t needed; |
|            memcpy (&temp_state, &state, sizeof (state)); |
|            needed = wcrtomb (temp_buf, *ws, &temp_state); |
|            if (needed == (size_t) -1) |
|              /* Error in the conversion.  */ |
|              return NULL; |
|            if (needed > len) |
|              { |
|                /* The next character does not fit into |
|                   the buffer, so return an error.  */ |
|                errno = E2BIG; |
|                return NULL; |
|              } |
|          } |
|        ... |
| |
| Here we first perform the conversion that might overflow the buffer |
| into a scratch array, so that we are afterwards in the position to make |
| an exact decision about the buffer size. Please note that a null |
| pointer cannot be used as the destination in this trial call: as |
| described above, with a null S the 'wcrtomb' function ignores WC and |
| behaves as if the NUL wide character were converted. The most |
| unusual thing about this piece of code certainly is the duplication of |
| the conversion state object, but if a change of the state is necessary |
| to emit the next multibyte character, we want to have the same shift |
| state change performed in the real conversion. Therefore, we have to |
| preserve the initial shift state information. |
| |
| There are certainly many more and even better solutions to this |
| problem. This example is only provided for educational purposes. |
| |
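| One of the alternative solutions can be sketched as follows: every |
| character is converted into a scratch array first and copied only |
| if it fits. The name 'mbscatwcs_checked' is hypothetical, and the |
| sketch assumes that the existing string in S ends in the initial |
| shift state. |
| |
```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Append the wide character string WS to the multibyte string in S,
   whose array holds LEN bytes in total.  Returns S, or NULL with
   errno set; on error the original string is left in place.
   Hypothetical helper, shown for illustration only.  */
char *
mbscatwcs_checked (char *s, size_t len, const wchar_t *ws)
{
  mbstate_t state;
  /* Find the end of the existing string.  */
  char *end = strchr (s, '\0');
  char *wp = end;

  len -= wp - s;
  memset (&state, '\0', sizeof (state));
  do
    {
      char tmp[MB_LEN_MAX];
      size_t nbytes = wcrtomb (tmp, *ws, &state);

      if (nbytes == (size_t) -1)
        {
          /* Error in the conversion; errno is EILSEQ.  */
          *end = '\0';
          return NULL;
        }
      if (nbytes > len)
        {
          /* The character does not fit into the buffer.  */
          *end = '\0';
          errno = E2BIG;
          return NULL;
        }
      memcpy (wp, tmp, nbytes);
      len -= nbytes;
      wp += nbytes;
    }
  while (*ws++ != L'\0');
  return s;
}
```
| |
| Since the conversion is aborted as soon as a character does not fit, |
| no copy of the state object is needed in this variant; the price is |
| one extra 'memcpy' call for each character. |
| |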
| |
| File: libc.info, Node: Converting Strings, Next: Multibyte Conversion Example, Prev: Converting a Character, Up: Restartable multibyte conversion |
| |
| 6.3.4 Converting Multibyte and Wide Character Strings |
| ----------------------------------------------------- |
| |
| The functions described in the previous section only convert a single |
| character at a time. Most operations to be performed in real-world |
| programs include strings and therefore the ISO C standard also defines |
| conversions on entire strings. However, the defined set of functions is |
| quite limited; therefore, the GNU C Library contains a few extensions |
| that can help in some important situations. |
| |
| -- Function: size_t mbsrtowcs (wchar_t *restrict DST, const char |
| **restrict SRC, size_t LEN, mbstate_t *restrict PS) |
| Preliminary: | MT-Unsafe race:mbsrtowcs/!ps | AS-Unsafe corrupt |
| heap lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX |
| Safety Concepts::. |
| |
| The 'mbsrtowcs' function ("multibyte string restartable to wide |
| character string") converts a NUL-terminated multibyte character |
| string at '*SRC' into an equivalent wide character string, |
| including the NUL wide character at the end. The conversion is |
| started using the state information from the object pointed to by |
| PS or from an internal object of 'mbsrtowcs' if PS is a null |
| pointer. Before returning, the state object is updated to match |
| the state after the last converted character. The state is the |
| initial state if the terminating NUL byte is reached and converted. |
| |
| If DST is not a null pointer, the result is stored in the array |
| pointed to by DST; otherwise, the conversion result is not |
| available since it is stored in an internal buffer. |
| |
| If LEN wide characters are stored in the array DST before reaching |
| the end of the input string, the conversion stops and LEN is |
| returned. If DST is a null pointer, LEN is never checked. |
| |
| Another reason for a premature return from the function call is if |
| the input string contains an invalid multibyte sequence. In this |
| case the global variable 'errno' is set to 'EILSEQ' and the |
| function returns '(size_t) -1'. |
| |
| In all other cases the function returns the number of wide |
| characters converted during this call. If DST is not null, |
| 'mbsrtowcs' stores in the pointer pointed to by SRC either a null |
| pointer (if the NUL byte in the input string was reached) or the |
| address of the byte following the last converted multibyte |
| character. |
| |
| 'mbsrtowcs' was introduced in Amendment 1 to ISO C90 and is |
| declared in 'wchar.h'. |
| |
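| Because LEN is not checked when DST is a null pointer, a first call |
| with a null DST can be used to count the wide characters of the |
| result, and a second call can then fill a buffer of exactly the |
| right size. The following is a sketch under that assumption; the |
| name 'mbs_to_wcs_exact' is hypothetical. |
| |
```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Convert the multibyte string S into a newly allocated wide
   character string of exactly the required size.  Returns NULL on
   error; the caller frees the result.  Hypothetical helper.  */
wchar_t *
mbs_to_wcs_exact (const char *s)
{
  mbstate_t state;
  const char *src = s;
  wchar_t *result;
  size_t n;

  /* First pass: nothing is stored, only the length is computed.  */
  memset (&state, '\0', sizeof (state));
  n = mbsrtowcs (NULL, &src, 0, &state);
  if (n == (size_t) -1)
    return NULL;                /* Invalid multibyte sequence.  */

  result = malloc ((n + 1) * sizeof (wchar_t));
  if (result == NULL)
    return NULL;

  /* Second pass: convert for real, including the NUL wide character.  */
  src = s;
  memset (&state, '\0', sizeof (state));
  if (mbsrtowcs (result, &src, n + 1, &state) == (size_t) -1)
    {
      free (result);
      return NULL;
    }
  return result;
}
```
| |
| The input is still scanned twice, so this approach trades conversion |
| time for memory. |
| |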
| The definition of the 'mbsrtowcs' function has one important |
| limitation. The requirement that the source string be NUL-terminated |
| causes problems if one wants to convert buffers with text. A buffer |
| is normally not a collection of NUL-terminated strings but instead a |
| continuous collection of lines, separated by newline characters. Now |
| assume that a function to convert one line from a buffer is needed. |
| Since the line is not NUL-terminated, the source pointer cannot directly |
| point into the unmodified text buffer. This means, either one inserts |
| the NUL byte at the appropriate place for the time of the 'mbsrtowcs' |
| function call (which is not doable for a read-only buffer or in a |
| multi-threaded application) or one copies the line in an extra buffer |
| where it can be terminated by a NUL byte. Note that it is not in |
| general possible to limit the number of characters to convert by setting |
| the parameter LEN to any specific value. Since it is not known how many |
| bytes each multibyte character sequence is in length, one can only |
| guess. |
| |
| There is still a problem with the method of NUL-terminating a line |
| right after the newline character, which could lead to very strange |
| results. As said in the description of the 'mbsrtowcs' function above |
| the conversion state is guaranteed to be in the initial shift state |
| after processing the NUL byte at the end of the input string. But this |
| NUL byte is not really part of the text (i.e., the conversion state |
| after the newline in the original text could be something different than |
| the initial shift state and therefore the first character of the next |
| line is encoded using this state). But the state in question is never |
| accessible to the user since the conversion stops after the NUL byte |
| (which resets the state). Most stateful character sets in use today |
| require that the shift state after a newline be the initial state, but |
| this is not a strict guarantee. Simply NUL-terminating a piece of |
| running text is therefore not always an adequate solution and should |
| never be used in general-purpose code. |
| |
| The generic conversion interface (*note Generic Charset Conversion::) |
| does not have this limitation (it simply works on buffers, not strings), |
| and the GNU C Library contains a set of functions that take additional |
| parameters specifying the maximal number of bytes that are consumed from |
| the input string. With these functions, the problem described above |
| for 'mbsrtowcs' can be solved by determining the line length and |
| passing this length to the function. |
| |
| -- Function: size_t wcsrtombs (char *restrict DST, const wchar_t |
| **restrict SRC, size_t LEN, mbstate_t *restrict PS) |
| Preliminary: | MT-Unsafe race:wcsrtombs/!ps | AS-Unsafe corrupt |
| heap lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX |
| Safety Concepts::. |
| |
| The 'wcsrtombs' function ("wide character string restartable to |
| multibyte string") converts the NUL-terminated wide character |
| string at '*SRC' into an equivalent multibyte character string and |
| stores the result in the array pointed to by DST. The NUL wide |
| character is also converted. The conversion starts in the state |
| described in the object pointed to by PS or by a state object |
| local to 'wcsrtombs' in case PS is a null pointer. If DST is a |
| null pointer, the conversion is performed as usual but the result |
| is not available. If all characters of the input string were |
| successfully converted and if DST is not a null pointer, the |
| pointer pointed to by SRC gets assigned a null pointer. |
| |
| If one of the wide characters in the input string has no valid |
| multibyte character equivalent, the conversion stops early, sets |
| the global variable 'errno' to 'EILSEQ', and returns '(size_t) -1'. |
| |
| Another reason for a premature stop is if DST is not a null pointer |
| and the next converted character would require more than LEN bytes |
| in total in the array DST. In this case the pointer pointed to by |
| SRC is assigned a value pointing to the wide character right after |
| the last one successfully converted. |
| |
| Except in the case of an encoding error the return value of the |
| 'wcsrtombs' function is the number of bytes in all the multibyte |
| character sequences stored in DST. Before returning the state in |
| the object pointed to by PS (or the internal object in case PS is a |
| null pointer) is updated to reflect the state after the last |
| conversion. The state is the initial shift state in case the |
| terminating NUL wide character was converted. |
| |
| The 'wcsrtombs' function was introduced in Amendment 1 to ISO C90 |
| and is declared in 'wchar.h'. |
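| |
| The following sketch shows one way to use LEN and the updated '*SRC' |
| to fill a fixed-size buffer and detect truncation. The helper |
| 'wcs_to_buf' and its TRUNCATED parameter are hypothetical; the sketch |
| also assumes an encoding without shift sequences. |

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert WCS into BUF, which holds SIZE bytes,
   and always NUL-terminate the result.  Returns the number of bytes
   stored, not counting the NUL byte, or (size_t) -1 on an encoding
   error.  *TRUNCATED is set to nonzero if input was left over.  */
size_t
wcs_to_buf (char *buf, size_t size, const wchar_t *wcs, int *truncated)
{
  mbstate_t state;
  const wchar_t *src = wcs;
  size_t n;

  assert (buf != NULL && size > 0 && wcs != NULL && truncated != NULL);

  memset (&state, '\0', sizeof (state));

  /* Reserve one byte so that the result can always be terminated.  */
  n = wcsrtombs (buf, &src, size - 1, &state);
  if (n == (size_t) -1)
    return n;

  buf[n] = '\0';

  /* 'src' is a null pointer if the NUL wide character was converted.
     Otherwise it points at the first unconverted wide character,
     which may be the terminator itself if only its byte did not fit.  */
  *truncated = (src != NULL && *src != L'\0');
  return n;
}
```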
| |
| The restriction mentioned above for the 'mbsrtowcs' function applies |
| here also. There is no possibility of directly controlling the number |
| of input characters. One has to place the NUL wide character at the |
| correct place or control the consumed input indirectly via the available |
| output array size (the LEN parameter). |
| |
| -- Function: size_t mbsnrtowcs (wchar_t *restrict DST, const char |
| **restrict SRC, size_t NMC, size_t LEN, mbstate_t *restrict |
| PS) |
| Preliminary: | MT-Unsafe race:mbsnrtowcs/!ps | AS-Unsafe corrupt |
| heap lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX |
| Safety Concepts::. |
| |
| The 'mbsnrtowcs' function is very similar to the 'mbsrtowcs' |
| function. All the parameters are the same except for NMC, which is |
| new. The return value is the same as for 'mbsrtowcs'. |
| |
| This new parameter specifies how many bytes at most can be used |
| from the multibyte character string. In other words, the multibyte |
| character string '*SRC' need not be NUL-terminated. But if a NUL |
| byte is found within the NMC first bytes of the string, the |
| conversion stops here. |
| |
| This function is a GNU extension. It is meant to work around the |
| problems mentioned above. Now it is possible to convert a buffer |
| with multibyte character text piece for piece without having to |
| care about inserting NUL bytes and the effect of NUL bytes on the |
| conversion state. |
| |
| A function to convert a multibyte string into a wide character string |
| and display it could be written like this (this is not a really useful |
| example): |
| |
| void |
| showmbs (const char *src, FILE *fp) |
| { |
| mbstate_t state; |
| int cnt = 0; |
| memset (&state, '\0', sizeof (state)); |
| while (1) |
| { |
| wchar_t linebuf[100]; |
| const char *endp = strchr (src, '\n'); |
| size_t n; |
| |
| /* Exit if there is no more line. */ |
| if (endp == NULL) |
| break; |
| |
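| n = mbsnrtowcs (linebuf, &src, endp - src, 99, &state); |
| /* Stop at an invalid multibyte sequence. */ |
| if (n == (size_t) -1) |
| break; |
| linebuf[n] = L'\0'; |
| fprintf (fp, "line %d: \"%S\"\n", ++cnt, linebuf); |
| /* Skip the rest of an overlong line and the newline itself. */ |
| src = endp + 1; |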
| } |
| } |
| |
| There is no problem with the state after a call to 'mbsnrtowcs'. |
| Since we don't insert characters in the strings that were not in there |
| right from the beginning and we use STATE only for the conversion of the |
| given buffer, there is no problem with altering the state. |
| |
| -- Function: size_t wcsnrtombs (char *restrict DST, const wchar_t |
| **restrict SRC, size_t NWC, size_t LEN, mbstate_t *restrict |
| PS) |
| Preliminary: | MT-Unsafe race:wcsnrtombs/!ps | AS-Unsafe corrupt |
| heap lock dlopen | AC-Unsafe corrupt lock mem fd | *Note POSIX |
| Safety Concepts::. |
| |
| The 'wcsnrtombs' function implements the conversion from wide |
| character strings to multibyte character strings. It is similar to |
| 'wcsrtombs' but, just like 'mbsnrtowcs', it takes an extra |
| parameter, which specifies the length of the input string. |
| |
| No more than NWC wide characters from the input string '*SRC' are |
| converted. If the input string contains a NUL wide character in |
| the first NWC characters, the conversion stops at this place. |
| |
| The 'wcsnrtombs' function is a GNU extension and just like |
| 'mbsnrtowcs' helps in situations where no NUL-terminated input |
| strings are available. |
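| |
| A short sketch of converting part of a larger wide character buffer |
| without inserting a terminator follows. The helper name |
| 'wcs_slice_to_mbs' is hypothetical. |

```c
#define _GNU_SOURCE             /* For 'wcsnrtombs'.  */
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert at most NWC wide characters starting at
   START into DST (DSTSIZE bytes) and NUL-terminate the result.
   Returns the number of bytes stored, not counting the NUL byte, or
   (size_t) -1 on an encoding error.  */
size_t
wcs_slice_to_mbs (char *dst, size_t dstsize,
                  const wchar_t *start, size_t nwc)
{
  mbstate_t state;
  const wchar_t *src = start;
  size_t n;

  assert (dst != NULL && dstsize > 0 && start != NULL);

  memset (&state, '\0', sizeof (state));

  /* NWC limits the input consumed; DSTSIZE - 1 limits the output so
     that one byte remains for the terminator.  */
  n = wcsnrtombs (dst, &src, nwc, dstsize - 1, &state);
  if (n == (size_t) -1)
    return n;

  dst[n] = '\0';
  return n;
}
```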
| |
| |
| File: libc.info, Node: Multibyte Conversion Example, Prev: Converting Strings, Up: Restartable multibyte conversion |
| |
| 6.3.5 A Complete Multibyte Conversion Example |
| --------------------------------------------- |
| |
| The example programs given in the last sections are only brief and do |
| not contain all the error checking, etc. Presented here is a complete |
| and documented example. It features the 'mbrtowc' function but it |
| should be easy to derive versions using the other functions. |
| |
| int |
| file_mbsrtowcs (int input, int output) |
| { |
| /* Note the use of 'MB_LEN_MAX'. |
| 'MB_CUR_MAX' cannot portably be used here. */ |
| char buffer[BUFSIZ + MB_LEN_MAX]; |
| mbstate_t state; |
| int filled = 0; |
| int eof = 0; |
| |
| /* Initialize the state. */ |
| memset (&state, '\0', sizeof (state)); |
| |
| while (!eof) |
| { |
| ssize_t nread; |
| ssize_t nwrite; |
| char *inp = buffer; |
| wchar_t outbuf[BUFSIZ]; |
| wchar_t *outp = outbuf; |
| |
| /* Fill up the buffer from the input file. */ |
| nread = read (input, buffer + filled, BUFSIZ); |
| if (nread < 0) |
| { |
| perror ("read"); |
| return 0; |
| } |
| /* If we reach end of file, make a note to read no more. */ |
| if (nread == 0) |
| eof = 1; |
| |
| /* 'filled' is now the number of bytes in 'buffer'. */ |
| filled += nread; |
| |
| /* Convert those bytes to wide characters, as many as we can. */ |
| while (filled > 0) |
| { |
| size_t thislen = mbrtowc (outp, inp, filled, &state); |
| /* Stop converting at an invalid character. */ |
| if (thislen == (size_t) -1) |
| break; |
| /* An incomplete character at the end of the buffer has |
| been absorbed into 'state'; the next read continues it. */ |
| if (thislen == (size_t) -2) |
| { |
| filled = 0; |
| break; |
| } |
| /* We want to handle embedded NUL bytes |
| but the return value is 0. Correct this. */ |
| if (thislen == 0) |
| thislen = 1; |
| /* Advance past this character. */ |
| inp += thislen; |
| filled -= thislen; |
| ++outp; |
| } |
| |
| /* Write the wide characters we just made. */ |
| nwrite = write (output, outbuf, |
| (outp - outbuf) * sizeof (wchar_t)); |
| if (nwrite < 0) |
| { |
| perror ("write"); |
| return 0; |
| } |
| |
| /* See if we have a _real_ invalid character: 'mbrtowc' |
| rejected a sequence, or the input ended in mid-character. */ |
| if (filled > 0 || (eof && !mbsinit (&state))) |
| { |
| error (0, 0, "invalid multibyte character"); |
| return 0; |
| } |
| } |
| |
| return 1; |
| } |
| |
| |
| File: libc.info, Node: Non-reentrant Conversion, Next: Generic Charset Conversion, Prev: Restartable multibyte conversion, Up: Character Set Handling |
| |
| 6.4 Non-reentrant Conversion Function |
| ===================================== |
| |
| The functions described in the previous section are defined in Amendment 1 |
| to ISO C90, but the original ISO C90 standard also contained functions |
| for character set conversion. The reason that these original functions |
| are not described first is that they are almost entirely useless. |
| |
| The problem is that all the conversion functions described in the |
| original ISO C90 use a local state. Using a local state implies that |
| multiple conversions at the same time (not only when using threads) |
| cannot be done, and that you cannot first convert single characters and |
| then strings since you cannot tell the conversion functions which state |
| to use. |
| |
| These original functions are therefore usable only in a very limited |
| set of situations. One must complete converting the entire string |
| before starting a new one, and each string/text must be converted with |
| the same function (there is no problem with the library itself; it is |
| guaranteed that no library function changes the state of any of these |
| functions). *For the above reasons it is strongly recommended that the |
| functions described in the previous section be used in place of |
| non-reentrant conversion functions.* |
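| |
| To illustrate what an explicit state object makes possible, the |
| following sketch interleaves two conversions, each with its own |
| 'mbstate_t'. The helper name 'mbs_equal' is hypothetical. The same |
| loop written with 'mbtowc' would mix both strings into one hidden |
| state. |

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: compare two multibyte strings wide character by
   wide character.  Returns 1 if they are equal, 0 if they differ, and
   -1 on an invalid or incomplete sequence.  */
int
mbs_equal (const char *a, const char *b)
{
  mbstate_t state_a, state_b;
  size_t left_a, left_b;

  assert (a != NULL && b != NULL);

  /* Bytes available in each string, including the NUL byte.  */
  left_a = strlen (a) + 1;
  left_b = strlen (b) + 1;

  /* One independent conversion state per string.  */
  memset (&state_a, '\0', sizeof (state_a));
  memset (&state_b, '\0', sizeof (state_b));

  for (;;)
    {
      wchar_t wa, wb;
      size_t len_a = mbrtowc (&wa, a, left_a, &state_a);
      size_t len_b = mbrtowc (&wb, b, left_b, &state_b);

      /* (size_t) -1 and (size_t) -2 are the two largest values.  */
      if (len_a >= (size_t) -2 || len_b >= (size_t) -2)
        return -1;
      if (wa != wb)
        return 0;
      if (len_a == 0)
        return 1;               /* Both strings ended together.  */

      a += len_a;
      left_a -= len_a;
      b += len_b;
      left_b -= len_b;
    }
}
```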
| |
| * Menu: |
| |
| * Non-reentrant Character Conversion:: Non-reentrant Conversion of Single |
| Characters. |
| * Non-reentrant String Conversion:: Non-reentrant Conversion of Strings. |
| * Shift State:: States in Non-reentrant Functions. |
| |
| |
| File: libc.info, Node: Non-reentrant Character Conversion, Next: Non-reentrant String Conversion, Up: Non-reentrant Conversion |
| |
| 6.4.1 Non-reentrant Conversion of Single Characters |
| --------------------------------------------------- |
| |
| -- Function: int mbtowc (wchar_t *restrict RESULT, const char *restrict |
| STRING, size_t SIZE) |
| Preliminary: | MT-Unsafe race | AS-Unsafe corrupt heap lock dlopen |
| | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::. |
| |
| The 'mbtowc' ("multibyte to wide character") function when called |
| with non-null STRING converts the first multibyte character |
| beginning at STRING to its corresponding wide character code. It |
| stores the result in '*RESULT'. |
| |
| 'mbtowc' never examines more than SIZE bytes. (The idea is to |
| supply for SIZE the number of bytes of data you have in hand.) |
| |
| 'mbtowc' with non-null STRING distinguishes three possibilities: |
| the first SIZE bytes at STRING start with valid multibyte |
| characters, they start with an invalid byte sequence or just part |
| of a character, or STRING points to an empty string (a null |
| character). |
| |
| For a valid multibyte character, 'mbtowc' converts it to a wide |
| character and stores that in '*RESULT', and returns the number of |
| bytes in that character (always at least 1 and never more than |
| SIZE). |
| |
| For an invalid byte sequence, 'mbtowc' returns -1. For an empty |
| string, it returns 0, also storing ''\0'' in '*RESULT'. |
| |
| If the multibyte character code uses shift characters, then |
| 'mbtowc' maintains and updates a shift state as it scans. If you |
| call 'mbtowc' with a null pointer for STRING, that initializes the |
| shift state to its standard initial value. It also returns nonzero |
| if the multibyte character code in use actually has a shift state. |
| *Note Shift State::. |
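| |
| A typical calling pattern, given here only as a sketch, resets the |
| shift state once and then walks the string one character at a time. |
| The helper name 'convert_with_mbtowc' is hypothetical. |

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert the multibyte string S into OUT, which
   has room for OUTLEN wide characters including the terminator.
   Returns the number of wide characters stored, not counting the
   terminator, or (size_t) -1 on an invalid byte sequence.  */
size_t
convert_with_mbtowc (wchar_t *out, size_t outlen, const char *s)
{
  size_t remaining;
  size_t count = 0;

  assert (out != NULL && outlen > 0 && s != NULL);

  remaining = strlen (s);

  /* A null STRING resets the internal shift state.  */
  (void) mbtowc (NULL, NULL, 0);

  while (remaining > 0 && count + 1 < outlen)
    {
      /* SIZE is the number of bytes actually in hand.  */
      int len = mbtowc (&out[count], s, remaining);
      if (len <= 0)
        return (size_t) -1;
      s += len;
      remaining -= (size_t) len;
      ++count;
    }

  out[count] = L'\0';
  return count;
}
```

| Because the state is hidden inside 'mbtowc', no other 'mbtowc' |
| conversion may run while this loop is in progress. |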
| |
| -- Function: int wctomb (char *STRING, wchar_t WCHAR) |
| Preliminary: | MT-Unsafe race | AS-Unsafe corrupt heap lock dlopen |
| | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::. |
| |
| The 'wctomb' ("wide character to multibyte") function converts the |
| wide character code WCHAR to its corresponding multibyte character |
| sequence, and stores the result in bytes starting at STRING. At |
| most 'MB_CUR_MAX' bytes are stored. |
| |
| 'wctomb' with non-null STRING distinguishes three possibilities for |
| WCHAR: a valid wide character code (one that can be translated to a |
| multibyte character), an invalid code, and 'L'\0''. |
| |
| Given a valid code, 'wctomb' converts it to a multibyte character, |
| storing the bytes starting at STRING. Then it returns the number |
| of bytes in that character (always at least 1 and never more than |
| 'MB_CUR_MAX'). |
| |
| If WCHAR is an invalid wide character code, 'wctomb' returns -1. |
| If WCHAR is 'L'\0'', it stores ''\0'' in '*STRING', preceded by |
| any shift sequence needed to return to the initial shift state, |
| and returns the number of bytes stored (1 if there is no shift |
| sequence). |
| |
| If the multibyte character code uses shift characters, then |
| 'wctomb' maintains and updates a shift state as it scans. If you |
| call 'wctomb' with a null pointer for STRING, that initializes the |
| shift state to its standard initial value. It also returns nonzero |
| if the multibyte character code in use actually has a shift state. |
| *Note Shift State::. |
| |
| Calling this function with a WCHAR argument of zero when STRING is |
| not null has the side-effect of reinitializing the stored shift |
| state _as well as_ storing the multibyte character ''\0'' and |
| returning the number of bytes stored. |
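| |
| Since 'wctomb' takes no size argument, a caller that fills a bounded |
| buffer can convert into a scratch array of 'MB_LEN_MAX' bytes first. |
| The helper 'append_wchar' below is a hypothetical sketch of that |
| approach. |

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append the multibyte form of WC to DST, which
   holds DSTSIZE bytes of which USED are already occupied.  Returns the
   new number of occupied bytes, or (size_t) -1 if WC is invalid or
   does not fit.  No terminator is added.  */
size_t
append_wchar (char *dst, size_t dstsize, size_t used, wchar_t wc)
{
  char tmp[MB_LEN_MAX];
  int len;

  assert (dst != NULL && used <= dstsize);

  /* Convert into a scratch buffer, since up to 'MB_CUR_MAX' bytes
     may be stored regardless of how much room DST has left.  */
  len = wctomb (tmp, wc);
  if (len < 0 || (size_t) len > dstsize - used)
    return (size_t) -1;

  memcpy (dst + used, tmp, (size_t) len);
  return used + (size_t) len;
}
```

| Note that the internal shift state has already been updated when the |
| helper reports that the character does not fit. |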
| |
| Similar to 'mbrlen' there is also a non-reentrant function that |
| computes the length of a multibyte character. It can be defined in |
| terms of 'mbtowc'. |
| |
| -- Function: int mblen (const char *STRING, size_t SIZE) |
| Preliminary: | MT-Unsafe race | AS-Unsafe corrupt heap lock dlopen |
| | AC-Unsafe corrupt lock mem fd | *Note POSIX Safety Concepts::. |
| |
| The 'mblen' function with a non-null STRING argument returns the |
| number of bytes that make up the multibyte character beginning at |
| STRING, never examining more than SIZE bytes. (The idea is to |
| supply for SIZE the number of bytes of data you have in hand.) |
| |
| The return value of 'mblen' distinguishes three possibilities: the |
| first SIZE bytes at STRING start with valid multibyte characters, |
| they start with an invalid byte sequence or just part of a |
| character, or STRING points to an empty string (a null character). |
| |
| For a valid multibyte character, 'mblen' returns the number of |
| bytes in that character (always at least '1' and never more than |
| SIZE). For an invalid byte sequence, 'mblen' returns -1. For an |
| empty string, it returns 0. |
| |
| If the multibyte character code uses shift characters, then 'mblen' |
| maintains and updates a shift state as it scans. If you call |
| 'mblen' with a null pointer for STRING, that initializes the shift |
| state to its standard initial value. It also returns a nonzero |
| value if the multibyte character code in use actually has a shift |
| state. *Note Shift State::. |
| |
| The function 'mblen' is declared in 'stdlib.h'. |
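| |
| The remark that this function can be defined in terms of 'mbtowc' |
| can be sketched as follows. Both 'my_mblen' and 'count_chars' are |
| hypothetical names. Unlike the real 'mblen', this stand-in shares |
| and modifies the shift state of 'mbtowc'. |

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for 'mblen': passing a null RESULT makes
   'mbtowc' measure the character without storing it.  */
int
my_mblen (const char *string, size_t size)
{
  return mbtowc (NULL, string, size);
}

/* Hypothetical helper: count the multibyte characters in S.
   Returns -1 on an invalid byte sequence.  */
long
count_chars (const char *s)
{
  size_t remaining;
  long count = 0;

  assert (s != NULL);

  remaining = strlen (s);

  /* Reset the shift state before starting a new string.  */
  (void) my_mblen (NULL, 0);

  while (remaining > 0)
    {
      int len = my_mblen (s, remaining);
      if (len <= 0)
        return -1;
      s += len;
      remaining -= (size_t) len;
      ++count;
    }
  return count;
}
```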
| |