commit 0d7224a6d8a77e5eebf5e18bded742490f3b20fd
parent c0f2b512a055c667cb751ef4526ea744f2428826
Author: Steven G. Johnson <stevenj@mit.edu>
Date: Tue, 15 Jul 2014 16:04:36 -0400
markdown and other cosmetic updates
Diffstat:
A  .gitignore | 10 ++++++++++
D  LICENSE    | 64 ----------------------------------------------------------------
A  LICENSE.md | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M  Makefile   | 41 +++--------------------------------------
D  README     | 63 ---------------------------------------------------------------
A  README.md  | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 174 insertions(+), 165 deletions(-)
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,10 @@
+*.tar.gz
+*.exe
+*.dll
+*.do
+*.o
+*.so
+*.a
+*.dll
+*.dylib
+*.dSYM
diff --git a/LICENSE b/LICENSE
@@ -1,64 +0,0 @@
-
-Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany
-
-Permission is hereby granted, free of charge, to any person obtaining a
-copy of this software and associated documentation files (the "Software"),
-to deal in the Software without restriction, including without limitation
-the rights to use, copy, modify, merge, publish, distribute, sublicense,
-and/or sell copies of the Software, and to permit persons to whom the
-Software is furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in
-all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
-DEALINGS IN THE SOFTWARE.
-
-
-This software distribution contains derived data from a modified version of
-the Unicode data files. The following license applies to that data:
-
-COPYRIGHT AND PERMISSION NOTICE
-
-Copyright (c) 1991-2007 Unicode, Inc. All rights reserved. Distributed
-under the Terms of Use in http://www.unicode.org/copyright.html.
-
-Permission is hereby granted, free of charge, to any person obtaining a
-copy of the Unicode data files and any associated documentation (the "Data
-Files") or Unicode software and any associated documentation (the
-"Software") to deal in the Data Files or Software without restriction,
-including without limitation the rights to use, copy, modify, merge,
-publish, distribute, and/or sell copies of the Data Files or Software, and
-to permit persons to whom the Data Files or Software are furnished to do
-so, provided that (a) the above copyright notice(s) and this permission
-notice appear with all copies of the Data Files or Software, (b) both the
-above copyright notice(s) and this permission notice appear in associated
-documentation, and (c) there is clear notice in each modified Data File or
-in the Software as well as in the documentation associated with the Data
-File(s) or Software that the data or software has been modified.
-
-THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
-KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
-MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF
-THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS
-INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR
-CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF
-USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
-TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
-PERFORMANCE OF THE DATA FILES OR SOFTWARE.
-
-Except as contained in this notice, the name of a copyright holder shall
-not be used in advertising or otherwise to promote the sale, use or other
-dealings in these Data Files or Software without prior written
-authorization of the copyright holder.
-
-
-Unicode and the Unicode logo are trademarks of Unicode, Inc., and may be
-registered in some jurisdictions. All other trademarks and registered
-trademarks mentioned herein are the property of their respective owners.
-
diff --git a/LICENSE.md b/LICENSE.md
@@ -0,0 +1,93 @@
+== libutf8proc license ==
+
+**libutf8proc** is a lightly updated version of the **utf8proc**
+library by Jan Behrens and the rest of the Public Software Group, who
+deserve nearly all of the credit for this library. Like utf8proc,
+whose copyright and license statements are reproduced below, all new
+work on the libutf8proc library is licensed under the [MIT "expat"
+license](http://opensource.org/licenses/MIT):
+
+*Copyright © 2014 by Steven G. Johnson.*
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the "Software"),
+to deal in the Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, sublicense,
+and/or sell copies of the Software, and to permit persons to whom the
+Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
+
+== Original utf8proc license ==
+
+*Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany*
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the "Software"),
+to deal in the Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, sublicense,
+and/or sell copies of the Software, and to permit persons to whom the
+Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
+
+== Unicode data license ==
+
+This software distribution contains derived data from a modified version of
+the Unicode data files. The following license applies to that data:
+
+**COPYRIGHT AND PERMISSION NOTICE**
+
+*Copyright (c) 1991-2007 Unicode, Inc. All rights reserved. Distributed
+under the Terms of Use in http://www.unicode.org/copyright.html.*
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of the Unicode data files and any associated documentation (the "Data
+Files") or Unicode software and any associated documentation (the
+"Software") to deal in the Data Files or Software without restriction,
+including without limitation the rights to use, copy, modify, merge,
+publish, distribute, and/or sell copies of the Data Files or Software, and
+to permit persons to whom the Data Files or Software are furnished to do
+so, provided that (a) the above copyright notice(s) and this permission
+notice appear with all copies of the Data Files or Software, (b) both the
+above copyright notice(s) and this permission notice appear in associated
+documentation, and (c) there is clear notice in each modified Data File or
+in the Software as well as in the documentation associated with the Data
+File(s) or Software that the data or software has been modified.
+
+THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
+KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF
+THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS
+INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR
+CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF
+USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
+TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
+PERFORMANCE OF THE DATA FILES OR SOFTWARE.
+
+Except as contained in this notice, the name of a copyright holder shall
+not be used in advertising or otherwise to promote the sale, use or other
+dealings in these Data Files or Software without prior written
+authorization of the copyright holder.
+
+Unicode and the Unicode logo are trademarks of Unicode, Inc., and may be
+registered in some jurisdictions. All other trademarks and registered
+trademarks mentioned herein are the property of their respective owners.
diff --git a/Makefile b/Makefile
@@ -9,20 +9,12 @@ cc = $(CC) $(cflags)
# meta targets
-c-library: libutf8proc.a libutf8proc.so
-
-ruby-library: ruby/utf8proc_native.so
-
-pgsql-library: pgsql/utf8proc_pgsql.so
+all: c-library
-all: c-library ruby-library ruby-gem pgsql-library
+c-library: libutf8proc.a libutf8proc.so
-clean::
+clean:
rm -f utf8proc.o libutf8proc.a libutf8proc.so
- cd ruby/ && test -e Makefile && (make clean && rm -f Makefile) || true
- rm -Rf ruby/gem/lib ruby/gem/ext
- rm -f ruby/gem/utf8proc-*.gem
- cd pgsql/ && make clean
# real targets
@@ -39,30 +31,3 @@ libutf8proc.so: utf8proc.o
libutf8proc.dylib: utf8proc.o
$(cc) -dynamiclib -o $@ $^ -install_name $(libdir)/$@
-
-ruby/Makefile: ruby/extconf.rb
- cd ruby && ruby extconf.rb
-
-ruby/utf8proc_native.so: utf8proc.h utf8proc.c utf8proc_data.c \
- ruby/utf8proc_native.c ruby/Makefile
- cd ruby && make
-
-ruby/gem/lib/utf8proc.rb: ruby/utf8proc.rb
- test -e ruby/gem/lib || mkdir ruby/gem/lib
- cp ruby/utf8proc.rb ruby/gem/lib/
-
-ruby/gem/ext/extconf.rb: ruby/extconf.rb
- test -e ruby/gem/ext || mkdir ruby/gem/ext
- cp ruby/extconf.rb ruby/gem/ext/
-
-ruby/gem/ext/utf8proc_native.c: utf8proc.h utf8proc_data.c utf8proc.c ruby/utf8proc_native.c
- test -e ruby/gem/ext || mkdir ruby/gem/ext
- cat utf8proc.h utf8proc_data.c utf8proc.c ruby/utf8proc_native.c | grep -v '#include "utf8proc.h"' | grep -v '#include "utf8proc_data.c"' | grep -v '#include "../utf8proc.c"' > ruby/gem/ext/utf8proc_native.c
-
-ruby-gem:: ruby/gem/lib/utf8proc.rb ruby/gem/ext/extconf.rb ruby/gem/ext/utf8proc_native.c
- cd ruby/gem && gem build utf8proc.gemspec
-
-pgsql/utf8proc_pgsql.so: utf8proc.h utf8proc.c utf8proc_data.c \
- pgsql/utf8proc_pgsql.c
- cd pgsql && make
-
diff --git a/README b/README
@@ -1,63 +0,0 @@
-
-Please read the LICENSE file, which is shipping with this software.
-
-
-*** QUICK START ***
-
-For compilation of the C library call "make c-library", for compilation of
-the ruby library call "make ruby-library" and for compilation of the
-PostgreSQL extension call "make pgsql-library".
-
-For ruby you can also create a gem-file by calling "make ruby-gem".
-
-"make all" can be used to build everything, but both ruby and PostgreSQL
-installations are required in this case.
-
-
-*** GENERAL INFORMATION ***
-
-The C library is found in this directory after successful compilation and
-is named "libutf8proc.a" and "libutf8proc.so". The ruby library consists of
-the files "utf8proc.rb" and "utf8proc_native.so", which are found in the
-subdirectory "ruby/". If you chose to create a gem-file it is placed in the
-"ruby/gem" directory. The PostgreSQL extension is named "utf8proc_pgsql.so"
-and resides in the "pgsql/" directory.
-
-Both the ruby library and the PostgreSQL extension are built as stand-alone
-libraries and are therefore not dependent on the dynamic version of the
-C library files, but this behaviour might change in future releases.
-
-The Unicode version being supported is 5.0.0.
-Note: Version 4.1.0 of Unicode Standard Annex #29 was used, as
- version 5.0.0 had not been available at the time of implementation.
-
-For Unicode normalizations, the following options have to be used:
-Normalization Form C: STABLE, COMPOSE
-Normalization Form D: STABLE, DECOMPOSE
-Normalization Form KC: STABLE, COMPOSE, COMPAT
-Normalization Form KD: STABLE, DECOMPOSE, COMPAT
-
-
-*** C LIBRARY ***
-
-The documentation for the C library is found in the utf8proc.h header file.
-"utf8proc_map" is most likely the function you will be using for mapping UTF-8
-strings, unless you want to allocate memory yourself.
-
-
-*** TODO ***
-
-- detect stable code points and process segments independently in order to
- save memory
-- do a quick check before normalizing strings to optimize speed
-- support stream processing
-
-
-*** CONTACT ***
-
-If you find any bugs or experience difficulties in compiling this software,
-please contact us:
-
-Project page: http://www.public-software-group.org/utf8proc
-
-
diff --git a/README.md b/README.md
@@ -0,0 +1,68 @@
+== libutf8proc ==
+
+The [libutf8proc package](https://github.com/JuliaLang/libutf8proc) is
+a lightly updated fork of the [utf8proc
+library](http://www.public-software-group.org/utf8proc) from Jan
+Behrens and the rest of the [Public Software
+Group](http://www.public-software-group.org/), who deserve *nearly all
+of the credit* for this package: a small, clean C library that
+provides Unicode normalization, case-folding, and other operations for
+data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8).
+
+The reason for this fork is that utf8proc is used for basic Unicode
+support in the [Julia language](http://julialang.org/) and the Julia
+developers wanted Unicode 7 support and other features, but the
+Public Software Group currently does not seem to have the resources
+necessary to update utf8proc. We hope that the fork can be merged
+back into the mainline utf8proc package before too long.
+
+(The original utf8proc package also includes Ruby and PostgreSQL plug-ins.
+We removed those from libutf8proc in order to focus exclusively on the C
+library for the time being. We will strive to keep API changes to a minimum,
+so libutf8proc should still be usable with the old plug-in code.)
+
+Like utf8proc, the libutf8proc package is licensed under the
+free/open-source [MIT "expat"
+license](http://opensource.org/licenses/MIT) (plus certain Unicode
+data governed by the similarly permissive [Unicode data
+license](http://www.unicode.org/copyright.html#Exhibit1)); please see
+the included `LICENSE.md` file for more detailed information.
+
+=== Quick Start ===
+
+To compile the C library, run `make`.
+
+=== General Information ===
+
+The C library is found in this directory after successful compilation
+and is named `libutf8proc.a` (for the static library) and
+`libutf8proc.so` (for the dynamic library).
+
+The supported Unicode version is 5.0.0.
+*Note:* Version 4.1.0 of Unicode Standard Annex #29 was used, as
+version 5.0.0 was not yet available at the time of implementation.
+
+For Unicode normalizations, the following options are used:
+
+* Normalization Form C: `STABLE`, `COMPOSE`
+* Normalization Form D: `STABLE`, `DECOMPOSE`
+* Normalization Form KC: `STABLE`, `COMPOSE`, `COMPAT`
+* Normalization Form KD: `STABLE`, `DECOMPOSE`, `COMPAT`
+
+=== C Library ===
+
+The documentation for the C library is found in the `utf8proc.h` header file.
+`utf8proc_map` is the function you will most likely use for mapping UTF-8
+strings, unless you want to allocate memory yourself.
+
+=== To Do ===
+
+* detect stable code points and process segments independently in order to save memory
+* do a quick check before normalizing strings to optimize speed
+* support stream processing
+
+=== Contact ===
+
+Bug reports, feature requests, and other queries can be filed at
+the [libutf8proc page on GitHub](https://github.com/JuliaLang/libutf8proc).
+
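As a rough illustration of the four normalization forms listed in the README.md hunk above, the following sketch uses Python's standard `unicodedata` module rather than libutf8proc itself (an assumption made so that the example runs without building the C library). The `FORM_OPTIONS` table and the `normalize_all` helper are hypothetical names introduced here; the option strings are only labels copied from the README, not calls into the C API.

```python
import unicodedata

# Hypothetical table pairing each normalization form with the option names
# listed in the README hunk. The strings are labels only, not libutf8proc calls.
FORM_OPTIONS = {
    "NFC": ("STABLE", "COMPOSE"),
    "NFD": ("STABLE", "DECOMPOSE"),
    "NFKC": ("STABLE", "COMPOSE", "COMPAT"),
    "NFKD": ("STABLE", "DECOMPOSE", "COMPAT"),
}


def normalize_all(text):
    """Return a dict mapping each form name to the normalized text."""
    return {form: unicodedata.normalize(form, text) for form in FORM_OPTIONS}


if __name__ == "__main__":
    # U+FB01 (fi ligature), a space, then "e" followed by U+0301 (combining acute).
    sample = "\ufb01 e\u0301"
    for form, result in normalize_all(sample).items():
        codepoints = " ".join(f"U+{ord(ch):04X}" for ch in result)
        print(f"{form:<4} {', '.join(FORM_OPTIONS[form]):<28} {codepoints}")
```

Running the script prints one row per form with the resulting code points. The compatibility forms (KC, KD) split the U+FB01 ligature into `f` and `i`, while the composing forms (C, KC) merge `e` plus U+0301 into U+00E9.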