%PDF-1.3 %âãÏÓ 2 0 obj << /Length 7364 >> stream BT /TT2 1 Tf 14 0 0 14 275.358 691.2 Tm 0 g /GS1 gs 0 Tc 0 Tw [(Berk)10(eley DB)]TJ /TT4 1 Tf 12 0 0 12 270.372 662.4 Tm (Michael A. Olson)Tj 1.0675 -1.2 TD [(K)25(eith Bostic)]TJ -0.336 -1.2 TD [(Mar)18(go Seltzer)]TJ /TT6 1 Tf -1.9715 -1.32 TD [(Sleepycat Softwar)37(e)10(,)10( )-10(Inc.)]TJ /TT2 1 Tf 2.9485 -3 TD (Abstract)Tj /TT4 1 Tf 10 0 0 10 79.2 565.56 Tm 0.0424 Tw [(Berk)10(ele)15(y)-292.5(D)0(B)-292.5(i)0(s)-292.4(a)0(n)-292.4(Open Source embedded database system with a number of k)10(e)15(y)15( )-15(adv)25(antages o)15(v)15(e)0(r)-292.4(comparable sys-)]TJ 0 -1.2 TD 0.0602 Tw [(tems. )-250(It)-310.2(is simple to use, supports concurrent access by multiple users, and pro)15(vides industrial-strength transaction)]TJ T* 0.1554 Tw [(support, including survi)25(ving system and disk crashes.)-655.5(This paper describes the design and technical features of)]TJ T* 0 Tw [(Berk)10(ele)15(y)-250(DB, the distrib)20(ution, and its license.)]TJ /TT2 1 Tf 12 0 0 12 79.2 505.56 Tm 0.25 Tw [(1. Intr)18(oduction)]TJ /TT4 1 Tf 10 0 0 10 79.2 489.36 Tm 0.0691 Tw [(The Berk)10(ele)15(y)-319.1(Database \(Berk)10(ele)15(y)-319.1(DB\) is an embedded)]TJ T* 0.0253 Tw [(database system that can be used in applications requir)20(-)]TJ T* 0.1636 Tw [(ing high-performance concurrent storage and retrie)25(v)25(a)0(l)]TJ T* 0.2618 Tw [(of k)10(e)15(y/v)25(alue pairs.)-761.9(The softw)10(are is distrib)20(uted as a)]TJ T* 0.0057 Tw [(library that can be link)10(ed directly into an application.)-505.8(It)]TJ T* 0.1453 Tw [(pro)15(vides a v)25(ariety of programmatic interf)10(aces, includ-)]TJ T* 0.0237 Tw [(ing callable APIs for C, C++, Perl, Tcl and Ja)20(v)25(a)0(.)-523.7(Users)]TJ T* 0.0326 Tw [(may do)25(wnload Berk)10(ele)15(y)-282.7(D)0(B)-282.7(from Sleep)10(ycat Softw)10(are’)55(s)]TJ T* 0 Tw [(W)80(e)0(b)-250(site, at)]TJ /TT6 1 Tf 4.918 0 TD [(www)74(.sleepycat.com)]TJ /TT4 1 Tf 7.8132 0 TD (.)Tj -12.7313 -1.62 TD 0.133 Tw [(Sleep)10(ycat distrib)20(utes Berk)10(ele)15(y)-383(D)0(B)-383(a)0(s)-383(a)0(n)-383(Open Source)]TJ 0 -1.2 TD 0.08 Tw [(product. )-250(The)-330(compan)15(y)-330(collects license fees for certain)]TJ T* 0 Tw [(uses of the softw)10(are and sells support and services.)]TJ /TT2 1 Tf 12 0 0 12 79.2 323.16 Tm 0.25 Tw (1.1. History)Tj /TT4 1 Tf 10 0 0 10 79.2 306.96 Tm 0.0558 Tw [(Berk)10(ele)15(y)-305.7(D)0(B)-305.7(b)0(e)15(g)15( )295.8(an)-305.8(as)-305.8(a)-305.8(n)0(e)25(w)25( )-25(implementation of a hash)]TJ T* 0.0843 Tw (access method to replace both)Tj /TT8 1 Tf 12.6665 0 TD 0 Tw (hsearch)Tj /TT4 1 Tf 4.5349 0 TD 0.0842 Tw [(and the v)25(ari-)]TJ -17.2014 -1.2 TD 0 Tw (ous)Tj /TT8 1 Tf 1.9358 0 TD (dbm)Tj /TT4 1 Tf 2.3469 0 TD 0.2967 Tw (implementations \()Tj /TT8 1 Tf 7.5452 0 TD 0 Tw (dbm)Tj /TT4 1 Tf 2.347 0 TD 0.2967 Tw [(from A)111(T&T)74(,)]TJ /TT8 1 Tf 5.8239 0 TD 0 Tw (ndbm)Tj /TT4 1 Tf -19.9988 -1.2 TD 0.1334 Tw [(from Berk)10(ele)15(y)65(,)65( )-65(and)]TJ /TT8 1 Tf 8.3073 0 TD 0 Tw (gdbm)Tj /TT4 1 Tf 2.7838 0 TD 0.1334 Tw [(from the GNU project\).)-633.3(In)]TJ -11.0911 -1.2 TD 0.0367 Tw [(1990 Seltzer and Y)55(igit produced a package called Hash)]TJ T* 0 Tw (to do this [Selt91].)Tj 0 -1.62 TD (The )Tj /TT9 1 Tf 2.1153 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.3106 Tw [(rst general release of Berk)10(ele)15(y)-560.6(DB, in 1991,)]TJ -2.6714 -1.2 TD 0.3038 Tw [(included some interf)10(ace changes and a ne)25(w)-553.9(B+tree)]TJ T* 0.0886 Tw [(access method.)-588.7(At roughly the same time, Seltzer and)]TJ T* 0.1201 Tw [(Olson de)25(v)15(eloped a prototype transaction system based)]TJ T* 0.3355 Tw [(on Berk)10(ele)15(y)-585.6(DB, called LIBTP [Selt92], b)20(ut ne)25(v)15(e)0(r)]TJ T* 0 Tw (released the code.)Tj 0 -1.62 TD 0.0653 Tw [(The 4.4BSD UNIX release included Berk)10(ele)15(y)-315.3(D)0(B)-315.3(1.85)]TJ 0 -1.2 TD 0.0601 Tw [(in 1992.)-560.1(Seltzer and Bostic maintained the code in the)]TJ T* 0.1545 Tw [(early 1990s in Berk)10(ele)15(y)-404.6(and in Massachusetts.)-654.6(Man)15(y)]TJ T* 0 Tw (users adopted the code during this period.)Tj 0 -1.62 TD 0.0431 Tw [(By mid-1996, users w)10(anted commercial support for the)]TJ 0 -1.2 TD 0.4533 Tw [(softw)10(are. )-250(In)-703.3(response, Bostic and Seltzer formed)]TJ T* 1.0128 Tw [(Sleep)10(ycat Softw)10(are. )-250(The)-1262.7(compan)15(y)-1512.7(enhances,)]TJ 24.4 42.72 TD 0.1623 Tw [(distrib)20(utes, and supports Berk)10(ele)15(y)-412.3(D)0(B)-412.4(and supporting)]TJ 0 -1.2 TD 0.22 Tw [(softw)10(are and documentation.)-720(Sleep)10(ycat released v)15(e)0(r)20(-)]TJ T* 0.1677 Tw [(sion 2.1 of Berk)10(ele)15(y)-417.7(D)0(B)-417.8(i)0(n)-417.8(mid-1997 with important)]TJ T* 0.006 Tw [(ne)25(w)-256(features, including support for concurrent access to)]TJ T* 0.1677 Tw [(databases. )-249.9(The)-417.6(compan)15(y)-417.7(mak)10(es about three commer)20(-)]TJ T* 0.0957 Tw [(cial releases a year)40(,)-345.8(and most recently shipped v)15(ersion)]TJ T* 0 Tw (2.8.)Tj /TT2 1 Tf 12 0 0 12 323.2 403.56 Tm [(1.2. )-250(Ov)10(er)10(view of Berk)10(eley DB)]TJ /TT4 1 Tf 10 0 0 10 323.2 387.36 Tm 0.3094 Tw [(The C interf)10(aces in Berk)10(ele)15(y)-559.4(D)0(B)-559.5(permit)]TJ /TT8 1 Tf 18.3756 0 TD 0 Tw (dbm)Tj /TT4 1 Tf 1.8003 0 TD (-style)Tj -20.1759 -1.2 TD 0.4586 Tw (record management for databases, with signi)Tj /TT9 1 Tf 20.1758 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD (cant)Tj -20.732 -1.2 TD -0.0001 Tc 0.1274 Tw [(e)14.9(xtensions to handle duplicate data items ele)14.9(gantly)64.9(,)-377.4(t)-0.1(o)]TJ T* 0 Tc 0.2427 Tw [(deal with concurrent access, and to pro)15(vide transac-)]TJ T* 0.071 Tw (tional support so that multiple changes can be simulta-)Tj T* 0.1273 Tw [(neously committed \(so that the)15(y)-377.3(are made permanent\))]TJ T* 0.1848 Tw (or rolled back \(so that the database is restored to its)Tj T* 0 Tw [(state at the be)15(ginning of the transaction\).)]TJ 0 -1.62 TD 0.1033 Tw [(C++ and Ja)20(v)25(a)25( )-25.1(interf)10(aces pro)15(vide a small set of classes)]TJ 0 -1.2 TD 0.1961 Tw [(for operating on a database.)-696.1(The main class in both)]TJ T* 0.0587 Tw (cases is called)Tj /TT8 1 Tf 6.0901 0 TD 0 Tw (Db)Tj /TT4 1 Tf 1.2002 0 TD 0.0586 Tw [(,)-308.6(and pro)15(vides methods that encapsu-)]TJ -7.2903 -1.2 TD 0.1128 Tw (late the)Tj /TT8 1 Tf 3.3906 0 TD 0 Tw (dbm)Tj /TT4 1 Tf 1.8003 0 TD 0.1129 Tw [(-style interf)10(aces that the C interf)10(aces pro-)]TJ -5.1909 -1.2 TD 0 Tw (vide.)Tj 0 -1.62 TD 0.2564 Tw [(Tcl and Perl interf)10(aces allo)25(w)-506.4(d)0(e)25(v)25( )496.4(elopers w)10(orking in)]TJ 0 -1.2 TD 0.1716 Tw [(those languages to use Berk)10(ele)15(y)-421.6(D)0(B)-421.6(i)0(n)-421.7(their applica-)]TJ T* 0.0919 Tw [(tions. )-250(Bindings)-341.9(for both languages are included in the)]TJ T* 0 Tw [(distrib)20(ution.)]TJ 0 -1.62 TD 0.1069 Tw [(De)25(v)15(elopers may compile their applications and link in)]TJ 0 -1.2 TD 0 Tw [(Berk)10(ele)15(y)-250(D)0(B)-250(statically or dynamically)65(.)]TJ /TT2 1 Tf 12 0 0 12 323.2 128.76 Tm [(1.3. )-250(Ho)10(w)-250(Berk)10(eley DB is used)]TJ /TT4 1 Tf 10 0 0 10 323.2 112.56 Tm 0.0654 Tw [(The Berk)10(ele)15(y)-315.5(D)0(B)-315.4(library supports concurrent access to)]TJ T* 0.2616 Tw [(databases. )-249.9(It)-511.5(can be link)10(ed into standalone applica-)]TJ T* 0.1487 Tw (tions, into a collection of cooperating applications, or)Tj T* 0.421 Tw [(into serv)15(ers that handle requests and do database)]TJ ET endstream endobj 3 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT8 7 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 12 0 obj << /Length 8732 >> stream BT /TT4 1 Tf 10 0 0 10 79.2 708 Tm 0 g /GS1 gs 0 Tc 0 Tw (operations on behalf of clients.)Tj 0 -1.62 TD 0.0858 Tw (Compared to using a standalone database management)Tj 0 -1.2 TD 0.0846 Tw [(system, Berk)10(ele)15(y)-334.6(D)0(B)-334.6(i)0(s)-334.6(easy to understand and simple)]TJ T* 0.3826 Tw [(to use.)-882.6(The softw)10(are stores and retrie)25(v)15(e)0(s)-632.5(records,)]TJ T* 0.277 Tw [(which consist of k)10(e)15(y/v)25(alue pairs.)-777(K)25(e)25( )517(ys)-527(are used to)]TJ T* 0.0698 Tw [(locate items and can be an)15(y)-319.8(data type or structure sup-)]TJ T* 0 Tw (ported by the programming language.)Tj 0 -1.62 TD 0.0813 Tw [(The programmer can pro)15(vide the functions that Berk)10(e-)]TJ 0 -1.2 TD 0.0763 Tw [(le)15(y)-326.4(D)0(B)-326.4(uses to operate on k)10(e)15(ys. )-250(F)15(or e)15(xample, B+trees)]TJ T* 0.172 Tw (can use a custom comparison function, and the Hash)Tj T* 0.0519 Tw [(access method can use a custom hash function.)-551.8(Berk)10(e-)]TJ T* 0.2722 Tw [(le)15(y)-522.2(D)0(B)-522.2(uses def)10(ault functions if none are supplied.)]TJ T* 0.0873 Tw [(Otherwise, Berk)10(ele)15(y)-337.3(D)0(B)-337.3(does not e)15(xamine or interpret)]TJ T* 0.0934 Tw [(either k)10(e)15(ys)-343.4(or)-343.4(v)25(alues in an)15(y)-343.4(w)10(ay)65(.)-593.4(V)111(alues may be arbi-)]TJ T* 0 Tw (trarily long.)Tj 0 -1.62 TD 0.069 Tw [(It is also important to understand what Berk)10(ele)15(y)-319(D)0(B)-319(i)0(s)]TJ 0 -1.2 TD 0.1865 Tw [(not. )-250(It)-436.5(is not a database serv)15(er that handles netw)10(ork)]TJ T* 0.0296 Tw [(requests. )-250.1(It)-279.7(is not an SQL engine that e)15(x)15(ecutes queries.)]TJ T* 0.1547 Tw (It is not a relational or object-oriented database man-)Tj T* 0 Tw (agement system.)Tj 0 -1.62 TD 0.1101 Tw [(It is possible to b)20(uild an)15(y)-360.1(o)0(f)-360.1(those on top of Berk)10(ele)15(y)]TJ 0 -1.2 TD 0.2116 Tw [(DB, b)20(ut the package, as distrib)20(uted, is an embedded)]TJ T* 0.1444 Tw [(database engine.)-644.4(It has been designed to be portable,)]TJ T* 0 Tw [(small, f)10(ast, and reliable.)]TJ /TT2 1 Tf 12 0 0 12 79.2 385.2 Tm [(1.4. )-250(A)25(pplications that use Berk)10(eley DB)]TJ /TT4 1 Tf 10 0 0 10 79.2 369 Tm 0.1749 Tw [(Berk)10(ele)15(y)-424.8(D)0(B)-424.8(i)0(s)-424.9(embedded in a v)25(ariety of proprietary)]TJ T* 0.384 Tw [(and Open Source softw)10(are packages.)-884(This section)]TJ T* 0 Tw [(highlights a fe)25(w)-250(o)0(f)-250(the products that use it.)]TJ 0 -1.62 TD 0.1467 Tw [(Directory serv)15(ers, which do data storage and retrie)25(v)25(a)0(l)]TJ 0 -1.2 TD 0.2823 Tw [(using the Local Directory Access Protocol \(LD)40(AP\),)]TJ T* 0.0956 Tw [(pro)15(vide naming and directory lookup service on local-)]TJ T* 0.2837 Tw [(area netw)10(orks. )-250(This)-533.7(service is, essentially)65(,)-533.6(database)]TJ T* 0.0039 Tw [(query and update, b)20(ut uses a simple protocol rather than)]TJ T* 0.2201 Tw [(SQL or ODBC.)-720.1(Berk)10(ele)15(y)-470.1(D)0(B)-470.1(i)0(s)-470.1(the embedded data)]TJ T* 0.1288 Tw [(manager in the majority of deplo)10(yed directory serv)15(ers)]TJ T* 0.2355 Tw [(today)65(,)-485.5(including LD)40(AP serv)15(ers from Netscape, Mes-)]TJ T* 0 Tw (sageDirect \(formerly Isode\), and others.)Tj 0 -1.62 TD 0.1886 Tw [(Berk)10(ele)15(y)-438.5(D)0(B)-438.5(i)0(s)-438.5(also embedded in a lar)18(ge number of)]TJ 0 -1.2 TD 0.5302 Tw [(mail serv)15(ers. )-250(Intermail,)-780.2(from Softw)10(are.com, uses)]TJ T* 0.2114 Tw [(Berk)10(ele)15(y)-461.3(D)0(B)-461.3(a)0(s)-461.3(a)-461.3(message store and as the backing)]TJ T* 0.3597 Tw [(store for its directory serv)15(er)55(.)-859.7(The sendmail serv)15(er)]TJ T* 0.1175 Tw [(\(including both the commercial Sendmail Pro of)25(fering)]TJ T* 0.3282 Tw [(from Sendmail, Inc. and the v)15(ersion distrib)20(uted by)]TJ T* 0.2304 Tw [(sendmail.or)18(g\) uses Berk)10(ele)15(y)-480.4(D)0(B)-480.4(t)0(o)-480.4(store aliases and)]TJ T* 0.901 Tw [(other information.)-1401(Similarly)65(,)-1151(Post)]TJ /TT9 1 Tf 16.3592 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.901 Tw (x \(formerly)Tj -16.9153 -1.2 TD 0.3465 Tw [(VMailer\) uses Berk)10(ele)15(y)-596.5(D)0(B)-596.5(t)0(o)-596.5(store administrati)25(v)15(e)]TJ T* 0 Tw (information.)Tj 0 -1.62 TD 0.0133 Tw [(In addition, Berk)10(ele)15(y)-263.4(D)0(B)-263.3(i)0(s)-263.3(embedded in a wide v)25(ariety)]TJ 0 -1.2 TD 0.4994 Tw [(of other softw)10(are products.)-999.4(Example applications)]TJ 24.4 62.76 TD 0.0373 Tw [(include managing access control lists, storing user k)10(e)15(ys)]TJ 0 -1.2 TD 0.275 Tw [(in a public-k)10(e)15(y)15( )-15(infrastructure, recording machine-to-)]TJ T* 0.0518 Tw [(netw)10(ork-address mappings in address serv)15(ers, and stor)20(-)]TJ T* 0.0411 Tw (ing con)Tj /TT9 1 Tf 3.0128 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0411 Tw [(guration and de)25(vice information in video post-)]TJ -3.5689 -1.2 TD 0 Tw [(production softw)10(are.)]TJ 0 -1.62 TD 0.2477 Tw [(Finally)65(,)-497.8(Berk)10(ele)15(y)-497.8(D)0(B)-497.8(i)0(s)-497.8(a)-497.8(part of man)15(y)-497.7(other Open)]TJ 0 -1.2 TD 0.0005 Tw [(Source softw)10(are packages a)20(v)25(ailable on the Internet.)-500.6(F)15(o)0(r)]TJ T* 0.0604 Tw [(e)15(xample, the softw)10(are is embedded in the Apache W)80(e)0(b)]TJ T* 0 Tw [(serv)15(er and the Gnome desktop.)]TJ /TT2 1 Tf 12 0 0 12 323.2 577.8 Tm 0.25 Tw [(2. Access)-250(Methods)]TJ /TT4 1 Tf 10 0 0 10 323.2 561.6 Tm 0.0828 Tw [(In database terminology)65(,)-332.9(a)0(n)-332.9(access method is the disk-)]TJ T* 0.1964 Tw (based structure used to store data and the operations)Tj T* 0.6053 Tw [(a)20(v)25(ailable on that structure.)-1105.3(F)15(o)0(r)-855.4(e)15(xample, man)15(y)]TJ T* 0.3853 Tw (database systems support a B+tree access method.)Tj T* 0.1203 Tw [(B+trees allo)25(w)-370.3(equality-based lookups \()]TJ /TT9 1 Tf 16.1276 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.1203 Tw [(nd k)10(e)15(ys)-370.4(equal)]TJ -16.6837 -1.2 TD 0.4 Tw (to some constant\), range-based lookups \()Tj /TT9 1 Tf 18.3848 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.4 Tw [(nd k)10(e)15(ys)]TJ -18.9409 -1.2 TD 0.1188 Tw [(between tw)10(o)-368.8(constants\) and record insertion and dele-)]TJ T* 0 Tw (tion.)Tj 0 -1.62 TD 0.2228 Tw [(Berk)10(ele)15(y)-472.9(D)0(B)-472.9(supports three access methods: B+tree,)]TJ 0 -1.2 TD 0.1553 Tw [(Extended Linear Hashing \(Hash\), and Fix)15(ed- or V)111(ari-)]TJ T* 0.3638 Tw [(able-length Records \(Recno\).)-863.8(All three operate on)]TJ T* 0.1956 Tw [(records composed of a k)10(e)15(y)15( )-15(and a data v)25(alue. )-250(In)-445.6(the)]TJ T* 0.1301 Tw [(B+tree and Hash access methods, k)10(e)15(ys)-380.1(can ha)20(v)15(e)15( )-15(arbi-)]TJ T* 0.3595 Tw [(trary structure.)-859.5(In the Recno access method, each)]TJ T* 0.0265 Tw [(record is assigned a record number)40(,)-276.5(which serv)15(es as the)]TJ T* 0.0306 Tw [(k)10(e)15(y)65(.)65( )-315(In)-280.6(all the access methods, the v)25(alue can ha)20(v)15(e)15( )-15(arbi-)]TJ T* 0.1417 Tw [(trary structure.)-641.7(The programmer can supply compari-)]TJ T* 0.2129 Tw [(son or hashing functions for k)10(e)15(ys, and Berk)10(ele)15(y)-462.9(D)0(B)]TJ T* 0 Tw [(stores and retrie)25(v)15(e)0(s)-250(v)25(alues without interpreting them.)]TJ 0 -1.62 TD 0.1069 Tw (All of the access methods use the host )Tj /TT9 1 Tf 16.3518 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.1069 Tw (lesystem as a)Tj -16.9079 -1.2 TD 0 Tw (backing store.)Tj /TT2 1 Tf 12 0 0 12 323.2 283.2 Tm 0.25 Tw (2.1. Hash)Tj /TT4 1 Tf 10 0 0 10 323.2 267 Tm 0.3986 Tw [(Berk)10(ele)15(y)-648.5(D)0(B)-648.5(includes a Hash access method that)]TJ T* 0.9862 Tw [(implements e)15(xtended linear hashing [Litw80].)]TJ T* 0.0017 Tw (Extended linear hashing adjusts the hash function as the)Tj T* 0.0506 Tw [(hash table gro)25(ws, attempting to k)10(eep all b)20(uck)10(ets under)20(-)]TJ T* 0 Tw (full in the steady state.)Tj 0 -1.62 TD 0.1649 Tw (The Hash access method supports insertion and dele-)Tj 0 -1.2 TD 0.0259 Tw [(tion of records and lookup by e)15(xact match only)65(.)-525.8(Appli-)]TJ T* 0.0038 Tw [(cations may iterate o)15(v)15(e)0(r)-253.8(all records stored in a table, b)20(u)0(t)]TJ T* 0 Tw [(the order in which the)15(y)-250(are returned is unde)]TJ /TT9 1 Tf 17.423 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD (ned.)Tj /TT2 1 Tf 12 0 0 12 323.2 136.8 Tm 0.25 Tw [(2.2. B+tr)18(ee)]TJ /TT4 1 Tf 10 0 0 10 323.2 120.5999 Tm 0.4683 Tw [(Berk)10(ele)15(y)-718.4(D)0(B)-718.4(includes a B+tree [Come79] access)]TJ 0 -1.2 TD 0.0002 Tw [(method. )-250(B+trees)-250.2(store records of k)10(e)15(y/v)25(alue pairs in leaf)]TJ T* 0.052 Tw [(pages, and pairs of \(k)10(e)15(y)65(,)-302(child page address\) at internal)]TJ T* 0.2885 Tw [(nodes. )-249.9(K)25(e)15(ys)-538.4(in)-538.4(the tree are stored in sorted order)40(,)]TJ ET endstream endobj 13 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 15 0 obj << /Length 9078 >> stream BT /TT4 1 Tf 10 0 0 10 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.0576 Tw (where the order is determined by the comparison func-)Tj 0 -1.2 TD 0.0815 Tw [(tion supplied when the database w)10(as created.)-581.5(P)15(ages at)]TJ T* 0.0389 Tw [(the leaf le)25(v)15(e)0(l)-288.9(o)0(f)-288.9(the tree include pointers to their neigh-)]TJ T* 0.1444 Tw [(bors to simplify tra)20(v)15(ersal. )-250(B+trees)-394.4(support lookup by)]TJ T* 0.0068 Tw [(e)15(xact match \(equality\) or range \(greater than or equal to)]TJ T* 0.0391 Tw [(a)-289.1(k)10(e)15(y\). )-250(Lik)10(e)-289.1(Hash tables, B+trees support record inser)20(-)]TJ T* 0 Tw [(tion, deletion, and iteration o)15(v)15(e)0(r)-250(all records in the tree.)]TJ 0 -1.62 TD 0.0646 Tw (As records are inserted and pages in the B+tree )Tj /TT9 1 Tf 19.7215 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0646 Tw (ll up,)Tj -20.2777 -1.2 TD 0.0223 Tw [(the)15(y)-272.2(are split, with about half the k)10(e)15(ys)-272.3(going into a ne)25(w)]TJ T* 0.1603 Tw [(peer page at the same le)25(v)15(e)0(l)-410.3(i)0(n)-410.3(the tree.)-660.3(Most B+tree)]TJ T* 0.0387 Tw [(implementations lea)20(v)15(e)15( )-15(both nodes half-full after a split.)]TJ T* 0.2763 Tw (This leads to poor performance in a common case,)Tj T* 0.1522 Tw [(where the caller inserts k)10(e)15(ys)-402.2(in)-402.2(order)55(.)-652.2(T)80(o)-402.3(handle this)]TJ T* 0.1642 Tw [(case, Berk)10(ele)15(y)-414.3(D)0(B)-414.3(k)10(eeps track of the insertion order)40(,)]TJ T* 0.2023 Tw [(and splits pages une)25(v)15(enly to k)10(eep pages fuller)55(.)-702.4(This)]TJ T* 0.23 Tw (reduces tree size, yielding better search performance)Tj T* 0 Tw (and smaller databases.)Tj 0 -1.62 TD 0.3177 Tw [(On deletion, empty pages are coalesced by re)25(v)15(erse)]TJ 0 -1.2 TD 0.203 Tw [(splits into single pages.)-703(The access method does no)]TJ T* 0.0347 Tw [(other page balancing on insertion or deletion.)-534.8(K)25(e)25( )274.7(ys)-284.8(are)]TJ T* 0.1926 Tw [(not mo)15(v)15(e)0(d)-442.7(among pages at e)25(v)15(ery update to k)10(eep the)]TJ T* 0.2206 Tw [(tree well-balanced.)-720.6(While this could impr)]TJ 17.9583 0 TD -0.015 Tc 0 Tw (ove )Tj 1.8846 0 TD 0 Tc (search)Tj -19.8428 -1.2 TD 0.2341 Tw [(times in some cases, the additional code comple)15(xity)]TJ T* 0 Tw [(leads to slo)25(wer updates and is prone to deadlocks.)]TJ 0 -1.62 TD 0.0449 Tw [(F)15(o)0(r)-294.8(simplicity)65(,)-294.8(Berk)10(ele)15(y)-294.9(D)0(B)-294.9(B+trees do no pre)]TJ /TT9 1 Tf 18.9923 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0449 Tw (x com-)Tj -19.5485 -1.2 TD 0 Tw [(pression of k)10(e)15(ys)-250(at)-250(internal or leaf nodes.)]TJ /TT2 1 Tf 12 0 0 12 79.2 365.4 Tm 0.25 Tw (2.3. Recno)Tj /TT4 1 Tf 10 0 0 10 79.2 349.2 Tm 0.0236 Tw [(Berk)10(ele)15(y)-273.6(D)0(B)-273.6(includes a )]TJ /TT9 1 Tf 9.8443 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0235 Tw [(x)15(ed- or v)25(ariable-length record)]TJ -10.4005 -1.2 TD 0.5075 Tw (access method, called)Tj /TT6 1 Tf 10.4629 0 TD 0 Tw (Recno)Tj /TT4 1 Tf 2.4985 0 TD 0.5075 Tw [(.)-1007.5(The Recno access)]TJ -12.9615 -1.2 TD 0.0896 Tw (method assigns logical record numbers to each record,)Tj T* 0.0978 Tw (and can search for and update records by record num-)Tj T* 0.0037 Tw [(ber)55(.)-503.7(Recno is able, for e)15(xample, to load a te)15(xt )]TJ /TT9 1 Tf 18.6121 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0036 Tw (le into a)Tj -19.1682 -1.2 TD 0.1514 Tw [(database, treating each line as a record.)-651.4(This permits)]TJ T* 0.1313 Tw [(f)10(ast searches by line number for applications lik)10(e)-381.2(t)0(e)15(x)0(t)]TJ T* 0 Tw (editors [Ston82].)Tj 0 -1.62 TD 0.259 Tw [(Recno is actually b)20(uilt on top of the B+tree access)]TJ 0 -1.2 TD 0.3191 Tw [(method and pro)15(vides a simple interf)10(ace for storing)]TJ T* 0.314 Tw [(sequentially-ordered data v)25(alues. )-250(The)-564(Recno access)]TJ T* 0.2266 Tw [(method generates k)10(e)15(ys)-476.6(internally)65(.)-726.6(The programmer’)55(s)]TJ T* 0.1602 Tw [(vie)25(w)-410.2(o)0(f)-410.2(the v)25(alues is that the)15(y)-410.2(are numbered sequen-)]TJ T* 0.0254 Tw [(tially from one.)-525.4(De)25(v)15(elopers can choose to ha)20(v)15(e)15( )-14.9(records)]TJ T* 0.9 Tw [(automatically renumbered when lo)25(wer)20(-numbered)]TJ T* 0.0041 Tw [(records are added or deleted.)-504.1(In this case, ne)25(w)-254.1(k)10(e)15(y)0(s)-254.1(can)]TJ T* 0 Tw [(be inserted between e)15(xisting k)10(e)15(ys.)]TJ /TT2 1 Tf 12 0 0 12 79.2 123 Tm 0.25 Tw [(3. F)25(eatur)18(es)]TJ /TT4 1 Tf 10 0 0 10 79.2 106.8 Tm 0.1827 Tw [(This section describes important features of Berk)10(ele)15(y)]TJ T* 0.0956 Tw [(DB. )-250(In)-345.6(general, de)25(v)15(elopers can choose which features)]TJ T* 0.0488 Tw (are useful to them, and use only those that are required)Tj 24.4 62.52 TD 0 Tw (by their application.)Tj 0 -1.62 TD 0.1029 Tw [(F)15(o)0(r)-352.9(e)15(xample, when an application opens a database, it)]TJ 0 -1.2 TD 0.0101 Tw [(can declare the de)15(gree of concurrenc)15(y)-260.1(and reco)15(v)15(ery that)]TJ T* 0.0048 Tw [(it requires.)-504.9(Simple stand-alone applications, and in par)20(-)]TJ T* 0.0491 Tw (ticular ports of applications that used)Tj /TT8 1 Tf 15.3464 0 TD 0 Tw (dbm)Tj /TT4 1 Tf 2.0994 0 TD 0.0491 Tw (or one of its)Tj -17.4458 -1.2 TD 0.1093 Tw [(v)25(ariants, generally do not require concurrent access or)]TJ T* 0.0975 Tw [(crash reco)15(v)15(ery)65(.)-597.5(Other applications, such as enterprise-)]TJ T* 0.308 Tw (class database management systems that store sales)Tj T* 0.2643 Tw (transactions or other critical data, need full transac-)Tj T* 0.393 Tw [(tional service.)-893(Single-user operation is f)10(aster than)]TJ T* 0.1175 Tw [(multi-user operation, since no o)15(v)15(erhead is incurred by)]TJ T* 0.0655 Tw [(locking. )-250.1(Running)-315.6(with the reco)15(v)15(ery system disabled is)]TJ T* 0.1732 Tw [(f)10(aster than running with it enabled, since log records)]TJ T* 0.2703 Tw (need not be written when changes are made to the)Tj T* 0 Tw (database.)Tj 0 -1.62 TD 0.0851 Tw (In addition, some core subsystems, including the lock-)Tj 0 -1.2 TD 0.0344 Tw [(ing system and the logging f)10(acility)65(,)-284.4(can be used outside)]TJ T* 0.1772 Tw [(the conte)15(xt of the access methods as well.)-677.3(Although)]TJ T* 0.1784 Tw [(fe)25(w)-428.4(users ha)20(v)15(e)15( )-15(chosen to do so, it is possible to use)]TJ T* 0.0939 Tw [(only the lock manager in Berk)10(ele)15(y)-343.9(D)0(B)-343.9(t)0(o)-343.9(control con-)]TJ T* 0.2242 Tw [(currenc)15(y)-474.3(i)0(n)-474.3(a)0(n)-474.3(application, without using an)15(y)-474.2(o)0(f)-474.2(the)]TJ T* 0.0158 Tw [(standard database services.)-515.8(Alternati)25(v)15(ely)65(,)-265.8(the caller can)]TJ T* 0.007 Tw [(inte)15(grate locking of non-database resources with Berk)10(e-)]TJ T* 0.2702 Tw [(le)15(y)-520.1(DB’)55(s)-520.1(transactional tw)10(o-phase locking system, to)]TJ T* 0.2892 Tw (impose transaction semantics on objects outside the)Tj T* 0 Tw (database.)Tj /TT2 1 Tf 12 0 0 12 323.2 369.6 Tm 0.25 Tw [(3.1. Pr)18(ogrammatic )250(interfaces)]TJ /TT4 1 Tf 10 0 0 10 323.2 353.4 Tm 0 Tw [(Berk)10(ele)15(y)-400.8(D)0(B)-400.8(d)0(e)]TJ /TT9 1 Tf 6.719 0 TD (Þ)Tj /TT4 1 Tf 0.5561 0 TD 0.1509 Tw (nes a simple API for database man-)Tj -7.2751 -1.2 TD 0.0952 Tw [(agement. )-250(The)-345.2(package does not include industry-stan-)]TJ T* 0.1898 Tw [(dard programmatic interf)10(aces such as Open Database)]TJ T* 0.0852 Tw [(Connecti)25(vity \(ODBC\), Object Linking and Embedding)]TJ T* 0.0817 Tw (for Databases \(OleDB\), or Structured Query Language)Tj T* 0.1527 Tw [(\(SQL\). )-250(These)-402.7(interf)10(aces, while useful, were designed)]TJ T* 0.2477 Tw (to promote interoperability of database systems, and)Tj T* 0 Tw (not simplicity or performance.)Tj 0 -1.62 TD 0.3192 Tw [(In response to customer demand, Berk)10(ele)15(y)-569.1(D)0(B)-569.1(2.5)]TJ 0 -1.2 TD 0.0538 Tw [(introduced support for the XA standard [Open94].)-553.9(XA)]TJ T* 0.052 Tw [(permits Berk)10(ele)15(y)-302(D)0(B)-302(t)0(o)-302(participate in distrib)20(uted trans-)]TJ T* 0.3373 Tw [(actions under a transaction processing monitor lik)10(e)]TJ T* 0.131 Tw [(T)45(u)0(x)15(edo from BEA Systems.)-631(Lik)10(e)-381(XA, other standard)]TJ T* 0.099 Tw [(interf)10(aces can be b)20(uilt on top of the core system.)-599(The)]TJ T* 0.0846 Tw [(standards do not belong inside Berk)10(ele)15(y)-334.6(DB, since not)]TJ T* 0 Tw (all applications need them.)Tj /TT2 1 Tf 12 0 0 12 323.2 139.2 Tm [(3.2. )-250(W)75(orking with r)18(ecords)]TJ /TT4 1 Tf 10 0 0 10 323.2 123 Tm 0.0634 Tw [(A)-313.4(database user may need to search for particular k)10(e)15(ys)]TJ T* 0.0907 Tw [(in a database, or may simply w)10(ant to bro)25(wse a)20(v)25(ailable)]TJ T* 0.1601 Tw [(records. )-250(Berk)10(ele)15(y)-410.1(D)0(B)-410.1(supports both k)10(e)15(yed access, to)]TJ /TT9 1 Tf T* 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0173 Tw [(nd one or more records with a gi)25(v)15(e)0(n)-267.3(k)10(e)15(y)65(,)-267.3(o)0(r)-267.3(sequential)]TJ -0.5562 -1.2 TD 0.053 Tw [(access, to retrie)25(v)15(e)15( )-15(all the records in the database one at)]TJ ET endstream endobj 16 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT8 7 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 18 0 obj << /Length 8968 >> stream BT /TT4 1 Tf 10 0 0 10 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.384 Tw [(a)-634(time. )-250(The)-634(order of the records returned during)]TJ 0 -1.2 TD 0.0208 Tw [(sequential scans depends on the access method.)-520.9(B+tree)]TJ T* 0.1495 Tw [(and Recno databases return records in sort order)40(,)-399.5(and)]TJ T* 0.0023 Tw [(Hash databases return them in apparently random order)55(.)]TJ 0 -1.62 TD 0 Tw [(Similarly)65(,)-495.9(Berk)10(ele)15(y)-495.9(D)0(B)-495.8(d)0(e)]TJ /TT9 1 Tf 11.3122 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2458 Tw [(nes simple interf)10(aces for)]TJ -11.8683 -1.2 TD 0 Tw (inserting, updating, and deleting records in a database.)Tj /TT2 1 Tf 12 0 0 12 79.2 613.8 Tm [(3.3. )-250(Long)-250(k)10(eys and v)10(alues)]TJ /TT4 1 Tf 10 0 0 10 79.2 597.6 Tm 0.1053 Tw [(Berk)10(ele)15(y)-355.3(D)0(B)-355.3(manages k)10(e)15(ys)-355.3(and v)25(alues as lar)18(ge as 2)]TJ 8 0 0 8 295.1806 602.6 Tm 0 Tw (32)Tj 10 0 0 10 79.2 585.6 Tm 0.0692 Tw [(bytes. )-250(Since)-319.2(the time required to cop)10(y)-319.2(a)-319.2(record is pro-)]TJ T* 0.1895 Tw [(portional to its size, Berk)10(ele)15(y)-439.6(D)0(B)-439.6(includes interf)10(aces)]TJ T* 0.4507 Tw [(that operate on partial records.)-950.7(If an application)]TJ T* 0.1273 Tw [(requires only part of a lar)18(ge record, it requests partial)]TJ T* 0.0025 Tw [(record retrie)25(v)25(al, and recei)25(v)15(e)0(s)-252.6(just the bytes that it needs.)]TJ T* 0 Tw [(The smaller cop)10(y)-250(s)0(a)20(v)20( )245(es)-250(both time and memory)65(.)]TJ 0 -1.62 TD 0.0706 Tw [(Berk)10(ele)15(y)-320.6(D)0(B)-320.6(allo)25(ws the programmer to de)]TJ /TT9 1 Tf 17.3687 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0706 Tw (ne the data)Tj -17.9249 -1.2 TD 0.272 Tw [(types of k)10(e)15(ys)-522(and v)25(alues. )-250(De)25(v)15(elopers use an)15(y)-522(type)]TJ T* 0 Tw [(e)15(xpressible in the programming language.)]TJ /TT2 1 Tf 12 0 0 12 79.2 455.4 Tm 0.25 Tw [(3.4. Lar)10(ge )250(databases)]TJ /TT4 1 Tf 10 0 0 10 79.2 439.2 Tm 0.0755 Tw [(A)-325.5(single database managed by Berk)10(ele)15(y)-325.6(D)0(B)-325.6(can be up)]TJ T* 0.1716 Tw (to 2)Tj 8 0 0 8 96.1943 432.2 Tm 0 Tw (48)Tj 10 0 0 10 108.4103 427.2 Tm 0.1716 Tw [(bytes, or 256 petabytes, in size.)-671.5(Berk)10(ele)15(y)-421.5(D)0(B)]TJ -2.921 -1.2 TD 0.2144 Tw (uses the host )Tj /TT9 1 Tf 6.004 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2144 Tw (lesystem as the backing store for the)Tj -6.5602 -1.2 TD 0.2667 Tw [(database, so lar)18(ge databases require big )]TJ /TT9 1 Tf 17.6034 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2667 Tw (le support)Tj -18.1595 -1.2 TD 0.3113 Tw [(from the operating system.)-811.3(Sleep)10(ycat Softw)10(are has)]TJ T* 0.5711 Tw [(customers using Berk)10(ele)15(y)-821.2(D)0(B)-821.2(t)0(o)-821.1(manage single)]TJ T* -0.0001 Tc 0.0001 Tw [(databases in e)14.9(xcess of 100 gigabytes.)]TJ /TT2 1 Tf 12 0 0 12 79.2 337.2 Tm 0 Tc 0.25 Tw [(3.5. Main)-250(memory )250(databases)]TJ /TT4 1 Tf 10 0 0 10 79.2 321 Tm 0.1171 Tw (Applications that do not require persistent storage can)Tj T* 0.0119 Tw [(create databases that e)15(xist only in main memory)65(.)-511.8(These)]TJ T* 0.0542 Tw [(databases bypass the o)15(v)15(erhead imposed by the I/O sys-)]TJ T* 0 Tw [(tem altogether)55(.)]TJ 0 -1.62 TD 0.2144 Tw (Some applications do need to use disk as a backing)Tj 0 -1.2 TD 0.2248 Tw [(store, b)20(ut run on machines with v)15(ery lar)18(ge memory)65(.)]TJ T* 0.0299 Tw [(Berk)10(ele)15(y)-279.9(D)0(B)-279.9(i)0(s)-279.9(able to manage v)15(ery lar)18(ge shared mem-)]TJ T* 0.0128 Tw [(ory re)15(gions for cached data pages, log records, and lock)]TJ T* 0.1437 Tw [(management. )-250.1(F)15(or e)15(xample, the cache re)15(gion used for)]TJ T* -0.0001 Tc 0.0034 Tw [(data pages may be gigabytes in size, reducing the lik)9.9(eli-)]TJ T* 0 Tc 0.0639 Tw [(hood that an)15(y)-313.9(read operation will need to visit the disk)]TJ T* 0.1201 Tw [(in the steady state.)-620.1(The programmer declares the size)]TJ T* 0 Tw [(of the cache re)15(gion at startup.)]TJ 0 -1.62 TD 0.4548 Tw [(Finally)65(,)-704.8(man)15(y)-704.8(operating systems pro)15(vide memory-)]TJ 0 -1.2 TD 0 Tw (mapped )Tj /TT9 1 Tf 3.6687 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2532 Tw [(le services that are much f)10(aster than their)]TJ -4.2249 -1.2 TD 0 Tw (general-purpose )Tj /TT9 1 Tf 6.9516 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2602 Tw [(le system interf)10(aces. )-250(Berk)10(ele)15(y)-510.2(D)0(B)]TJ -7.5078 -1.2 TD 0.5118 Tw (can memory-map its database )Tj /TT9 1 Tf 14.2093 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.5118 Tw (les for read-only)Tj -14.7655 -1.2 TD 0.3917 Tw [(database use.)-891.7(The application operates on records)]TJ T* 0.2069 Tw (stored directly on the pages, with no cache manage-)Tj T* 0.1556 Tw [(ment o)15(v)15(erhead. )-250.1(Because)-405.7(the application gets pointers)]TJ 24.4 62.34 TD 0.1265 Tw [(directly into the Berk)10(ele)15(y)-376.5(D)0(B)-376.5(pages, writes cannot be)]TJ 0 -1.2 TD 0.1275 Tw [(permitted. )-250(Otherwise,)-377.5(changes could bypass the lock-)]TJ T* 0.023 Tw [(ing and logging systems, and softw)10(are errors could cor)20(-)]TJ T* 0.4006 Tw [(rupt the database.)-900.7(Read-only applications can use)]TJ T* 0 Tw [(Berk)10(ele)15(y)-289.3(DB’)55(s)-289.3(memory-mapped )]TJ /TT9 1 Tf 13.3397 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0393 Tw [(le service to impro)15(v)15(e)]TJ -13.8958 -1.2 TD 0 Tw (performance on most architectures.)Tj /TT2 1 Tf 12 0 0 12 323.2 618 Tm 0.25 Tw (3.6. Con)Tj /TT10 1 Tf 3.7783 0 TD 0 Tw (Þ)Tj /TT2 1 Tf 0.5562 0 TD [(gurable)-250(page size)]TJ /TT4 1 Tf 10 0 0 10 323.2 601.8 Tm 0.0111 Tw (Programmers declare the size of the pages used by their)Tj 0 -1.2 TD 0.0403 Tw [(access methods when the)15(y)-290.3(create a database.)-540.3(Although)]TJ T* 0.1546 Tw [(Berk)10(ele)15(y)-404.6(D)0(B)-404.6(pro)15(vides reasonable def)10(aults, de)25(v)15(elopers)]TJ T* 0.364 Tw [(may o)15(v)15(erride them to control system performance.)]TJ T* 0.0793 Tw (Small pages reduce the number of records that )Tj /TT9 1 Tf 19.4611 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0793 Tw (t on a)Tj -20.0172 -1.2 TD 0.0353 Tw [(single page.)-535.3(Fe)25(wer records on a page means that fe)25(wer)]TJ T* 0.0723 Tw [(records are lock)10(ed when the page is lock)10(ed, impro)15(ving)]TJ T* 0.1462 Tw [(concurrenc)15(y)65(.)65( )-315(The per)20(-page o)15(v)15(erhead is proportionally)]TJ T* 0.229 Tw [(higher with smaller pages, of course, b)20(ut de)25(v)15(elopers)]TJ T* 0 Tw [(can trade of)25(f)-250(space for time as an application requires.)]TJ /TT2 1 Tf 12 0 0 12 323.2 463.8 Tm 0.25 Tw [(3.7. Small)-250(f)25(ootprint)]TJ /TT4 1 Tf 10 0 0 10 323.2 447.6 Tm 0.1474 Tw [(Berk)10(ele)15(y)-397.3(D)0(B)-397.3(i)0(s)-397.4(a)-397.4(compact system.)-647.4(The full package,)]TJ T* 0.0831 Tw [(including all access methods, reco)15(v)15(erability)65(,)-333.1(and trans-)]TJ T* 0.1235 Tw [(action support is roughly 175K of te)15(xt space on com-)]TJ T* 0 Tw (mon architectures.)Tj /TT2 1 Tf 12 0 0 12 323.2 381.6 Tm 0.25 Tw (3.8. Cursors)Tj /TT4 1 Tf 10 0 0 10 323.2 365.4 Tm 0.157 Tw [(In database terminology)65(,)-407(a)-407(cursor is a pointer into an)]TJ T* 0.1806 Tw [(access method that can be called iterati)25(v)15(ely to return)]TJ T* 0.368 Tw [(records in sequence.)-868(Berk)10(ele)15(y)-618(D)0(B)-618(includes cursor)]TJ T* 0.2814 Tw [(interf)10(aces for all access methods.)-781.4(This permits, for)]TJ T* 0.034 Tw [(e)15(xample, users to tra)20(v)15(erse a B+tree and vie)25(w)-284(records in)]TJ T* 0.1234 Tw [(order)55(.)-623.3(Pointers to records in cursors are persistent, so)]TJ T* 0.1779 Tw (that once fetched, a record may be updated in place.)Tj T* 0.1939 Tw [(Finally)65(,)-443.8(cursors support access to chains of duplicate)]TJ T* 0 Tw [(data items in the v)25(arious access methods.)]TJ /TT2 1 Tf 12 0 0 12 323.2 239.4 Tm 0.25 Tw [(3.9. J)15(oins)]TJ /TT4 1 Tf 10 0 0 10 323.2 223.2 Tm 0.2702 Tw [(In database terminology)65(,)-520.3(a)-520.3(join is an operation that)]TJ T* 0.0616 Tw [(spans multiple separate tables \(or in the case of Berk)10(e-)]TJ T* 0.2018 Tw [(le)15(y)-451.8(DB, multiple separate DB )]TJ /TT9 1 Tf 13.1024 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2017 Tw [(les\).)-701.7(F)15(o)0(r)-451.7(e)15(xample, a)]TJ -13.6586 -1.2 TD 0.0873 Tw [(compan)15(y)-337.2(may store information about its customers in)]TJ T* 0.1545 Tw [(one table and information about sales in another)55(.)-654.5(A)0(n)]TJ T* 0.1498 Tw [(application will lik)10(ely w)10(ant to look up sales informa-)]TJ T* 0.0933 Tw (tion by customer name; this requires matching records)Tj T* 0.228 Tw [(in the tw)10(o)-478(tables that share a common customer ID)]TJ /TT9 1 Tf T* 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0015 Tw [(eld. )-250(This)-251.5(combining of records from multiple tables is)]TJ -0.5562 -1.2 TD 0 Tw (called a join.)Tj 0 -1.62 TD 0.3061 Tw [(Berk)10(ele)15(y)-556.1(D)0(B)-556.1(includes interf)10(aces for joining tw)10(o)-556.2(o)0(r)]TJ 0 -1.2 TD 0 Tw (more tables.)Tj ET endstream endobj 19 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT9 8 0 R /TT10 20 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 22 0 obj << /Length 9070 >> stream BT /TT2 1 Tf 12 0 0 12 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.25 Tw [(3.10. T)74(ransactions)]TJ /TT4 1 Tf 10 0 0 10 79.2 691.8 Tm 0 Tw [(T)35(ransactions ha)20(v)15(e)15( )-15(four properties [Gray93]:)]TJ 8 0 0 8 84.2 675.6 Tm (•)Tj 10 0 0 10 104.2008 675.6 Tm 0.2989 Tw [(The)15(y)-548.9(are atomic.)-798.9(That is, all of the changes)]TJ 0 -1.2 TD 0.1475 Tw (made in a single transaction must be applied at)Tj T* 0.131 Tw [(the same instant or not at all.)-631(This permits, for)]TJ T* 0.3565 Tw [(e)15(xample, the transfer of mone)15(y)-606.5(between tw)10(o)]TJ T* 0.368 Tw (accounts to be accomplished, by making the)Tj T* 0.127 Tw (reduction of the balance in one account and the)Tj T* 0 Tw (increase in the other into a single, atomic action.)Tj 8 0 0 8 84.2 587.4 Tm (•)Tj 10 0 0 10 104.2008 587.4 Tm 0.0625 Tw [(The)15(y)-312.5(must be consistent.)-562.5(That is, changes to the)]TJ T* 0.3628 Tw [(database by an)15(y)-612.8(transaction cannot lea)20(v)15(e)15( )-15.1(the)]TJ T* -0.0001 Tc 0.0001 Tw [(database in an ille)14.9(gal)-250.1(o)-0.1(r)-250.1(corrupt state.)]TJ 8 0 0 8 84.2 547.2 Tm 0 Tc 0 Tw (•)Tj 10 0 0 10 104.2008 547.2 Tm -0.0001 Tc 0.0506 Tw [(The)14.9(y)-300.7(must be isolatable.)-550.7(Re)14.9(gardless of the num-)]TJ T* 0 Tc 0.08 Tw [(ber of users w)10(orking in the database at the same)]TJ T* 0.188 Tw [(time, e)25(v)15(ery user must ha)20(v)15(e)15( )-15(the illusion that no)]TJ T* 0 Tw [(other acti)25(vity is going on.)]TJ 8 0 0 8 84.2 495 Tm (•)Tj 10 0 0 10 104.2008 495 Tm 0.304 Tw [(The)15(y)-554(must be durable.)-804(Ev)15(en if the disk that)]TJ T* 0.0877 Tw (stores the database is lost, it must be possible to)Tj T* 0.0168 Tw [(reco)15(v)15(e)0(r)-266.8(the database to its last transaction-consis-)]TJ T* 0 Tw (tent state.)Tj -2.5 -1.62 TD 0.249 Tw [(This combination of properties — atomicity)65(,)-499(consis-)]TJ 0 -1.2 TD 0.3243 Tw [(tenc)15(y)65(,)65( )-64.9(isolation, and durability — is referred to as)]TJ T* 0.3458 Tw [(A)40(CIDity in the literature.)-845.9(Berk)10(ele)15(y)-595.8(DB, lik)10(e)-595.8(most)]TJ T* 0.0993 Tw [(database systems, pro)15(vides A)40(CIDity using a collection)]TJ T* 0 Tw (of core services.)Tj 0 -1.62 TD 0.0257 Tw [(Programmers can choose to use Berk)10(ele)15(y)-275.7(DB’)55(s)-275.7(transac-)]TJ 0 -1.2 TD 0 Tw (tion services for applications that need them.)Tj /TT2 1 Tf 12 0 0 12 79.2 336.6 Tm 0.25 Tw [(3.10.1. Write-ahead)-250(logging)]TJ /TT4 1 Tf 10 0 0 10 79.2 320.4 Tm 0.0479 Tw [(Programmers can enable the logging system when the)15(y)]TJ T* 0.0917 Tw [(start up Berk)10(ele)15(y)-341.8(DB. )-250.1(During)-341.7(a)-341.7(transaction, the appli-)]TJ T* 0.0493 Tw [(cation mak)10(es a series of changes to the database.)-549.4(Each)]TJ T* 0.0552 Tw [(change is captured in a log entry)65(,)-305.2(which holds the state)]TJ T* 0.0207 Tw (of the database record both before and after the change.)Tj T* 0.2208 Tw (The log record is guaranteed to be )Tj /TT9 1 Tf 15.4567 0 TD 0 Tw (ß)Tj /TT4 1 Tf 0.5562 0 TD 0.2208 Tw (ushed to stable)Tj -16.0129 -1.2 TD 0.0871 Tw [(storage before an)15(y)-337.1(o)0(f)-337.1(the changed data pages are writ-)]TJ T* 0.1489 Tw [(ten. )-250(This)-398.9(beha)20(vior — writing the log before the data)]TJ T* 0 Tw (pages — is called)Tj /TT6 1 Tf 7.3311 0 TD [(write-ahead lo)10(g)10(ging)]TJ /TT4 1 Tf 8.1182 0 TD (.)Tj -15.4492 -1.62 TD 0.0835 Tw [(At an)15(y)-333.5(time during the transaction, the application can)]TJ /TT6 1 Tf 0 -1.2 TD 0 Tw (commit)Tj /TT4 1 Tf 2.9438 0 TD 0.1702 Tw [(,)-420.2(making the changes permanent, or)]TJ /TT6 1 Tf 15.5162 0 TD 0.1701 Tw [(r)45(oll bac)20(k)]TJ /TT4 1 Tf 3.6876 0 TD 0 Tw (,)Tj -22.1477 -1.2 TD 0.0852 Tw (cancelling all changes and restoring the database to its)Tj T* 0.157 Tw [(pre-transaction state.)-657(If the application rolls back the)]TJ T* 0.1003 Tw (transaction, then the log holds the state of all changed)Tj T* 0.05 Tw [(pages prior to the transaction, and Berk)10(ele)15(y)-300(D)0(B)-300(simply)]TJ T* 0.0226 Tw [(restores that state.)-522.6(If the application commits the trans-)]TJ T* 0.0538 Tw [(action, Berk)10(ele)15(y)-303.8(D)0(B)-303.8(writes the log records to disk.)-553.7(In-)]TJ T* 0.2312 Tw (memory copies of the data pages already re)Tj /TT9 1 Tf 18.9719 0 TD 0 Tw (ß)Tj /TT4 1 Tf 0.5562 0 TD 0.2312 Tw (ect the)Tj -19.5281 -1.2 TD 0.1399 Tw (changes, and will be )Tj /TT9 1 Tf 8.9737 0 TD 0 Tw (ß)Tj /TT4 1 Tf 0.5562 0 TD 0.1399 Tw [(ushed as necessary during nor)20(-)]TJ -9.5298 -1.2 TD 0.235 Tw [(mal processing.)-735(Since log writes are sequential, b)20(u)0(t)]TJ T* 0.8732 Tw [(data page writes are random, this impro)15(v)15(e)0(s)]TJ 24.4 63.18 TD 0 Tw (performance.)Tj /TT2 1 Tf 12 0 0 12 323.2 678 Tm 0.25 Tw [(3.10.2. Crashes)-250(and )250(r)18(eco)10(v)10(ery)]TJ /TT4 1 Tf 10 0 0 10 323.2 661.8 Tm 0.1093 Tw [(Berk)10(ele)15(y)-359.2(DB’)55(s)-359.2(write-ahead log is used by the transac-)]TJ 0 -1.2 TD 0.0414 Tw [(tion system to commit or roll back transactions.)-541.4(It also)]TJ T* 0.073 Tw [(gi)25(v)15(e)0(s)-323(the reco)15(v)15(ery system the information that it needs)]TJ T* -0.0001 Tc 0.0825 Tw (to protect against data loss or corruption from crashes.)Tj T* 0 Tc 0.0204 Tw [(Berk)10(ele)15(y)-270.3(D)0(B)-270.3(i)0(s)-270.4(able to survi)25(v)15(e)15( )-15(application crashes, sys-)]TJ T* 0.0407 Tw [(tem crashes, and e)25(v)15(en)-290.8(catastrophic f)10(ailures lik)10(e)-290.7(the loss)]TJ T* 0 Tw [(of a hard disk, without losing an)15(y)-250(data.)]TJ 0 -1.62 TD 0.0538 Tw [(Survi)25(ving crashes requires data stored in se)25(v)15(eral dif)25(fer)20(-)]TJ 0 -1.2 TD 0.252 Tw [(ent places.)-752(During normal processing, Berk)10(ele)15(y)-502(D)0(B)]TJ T* 0.0766 Tw [(has copies of acti)25(v)15(e)15( )-15(log records and recently-used data)]TJ T* 0.1539 Tw [(pages in memory)65(.)-653.9(Log records are )]TJ /TT9 1 Tf 15.02 0 TD 0 Tw (ß)Tj /TT4 1 Tf 0.5562 0 TD 0.1539 Tw (ushed to the log)Tj -15.5762 -1.2 TD 0.0694 Tw [(disk when transactions commit.)-569.4(Data pages trickle out)]TJ T* 0.0008 Tw (to the data disk as pages m)Tj 10.7245 0 TD -0.015 Tc 0 Tw (ove )Tj 1.6646 0 TD 0 Tc 0.0008 Tw [(through the b)20(u)0(f)25(fer cache.)]TJ -12.3892 -1.2 TD 0.0191 Tw [(Periodically)65(,)-269.1(the system administrator backs up the data)]TJ T* 0.0278 Tw [(disk, creating a safe cop)10(y)-277.8(o)0(f)-277.8(the database at a particular)]TJ T* 0.0109 Tw [(instant. )-250(When)-260.9(the database is back)10(ed up, the log can be)]TJ T* 0.1337 Tw [(truncated. )-250.1(F)15(or maximum rob)20(ustness, the log disk and)]TJ T* 0 Tw [(data disk should be separate de)25(vices.)]TJ 0 -1.62 TD 0.129 Tw [(Dif)25(ferent system f)10(ailures can destro)10(y)-379(memory)65(,)-379(the log)]TJ 0 -1.2 TD 0.1106 Tw [(disk, or the data disk.)-610.6(Berk)10(ele)15(y)-360.6(D)0(B)-360.6(i)0(s)-360.6(able to survi)25(v)15(e)]TJ T* 0.0679 Tw [(the loss of an)15(y)-317.9(one of these repositories without losing)]TJ T* 0 Tw [(an)15(y)-250(committed transactions.)]TJ 0 -1.62 TD 0.1371 Tw [(If the computer’)55(s)-387.1(memory is lost, through an applica-)]TJ 0 -1.2 TD 0.1619 Tw (tion or operating system crash, then the log holds all)Tj T* 0.1788 Tw [(committed transactions.)-678.9(On restart, the reco)15(v)15(ery sys-)]TJ T* -0.0001 Tc 0.0491 Tw [(tem rolls the log forw)9.9(ard against the database, reapply-)]TJ T* 0 Tc 0.0681 Tw [(ing an)15(y)-318.1(changes to on-disk pages that were in memory)]TJ T* 0.014 Tw [(at the time of the crash.)-514(Since the log contains pre- and)]TJ T* 0.0956 Tw [(post-change state for transactions, the reco)15(v)15(ery system)]TJ T* 0.114 Tw [(also uses the log to restore an)15(y)-364(pages to their original)]TJ T* 0.1615 Tw [(state if the)15(y)-411.5(were modi)]TJ /TT9 1 Tf 9.7946 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.1615 Tw [(ed by transactions that ne)25(v)15(e)0(r)]TJ -10.3507 -1.2 TD 0 Tw (committed.)Tj 0 -1.62 TD 0.2051 Tw (If the data disk is lost, the system administrator can)Tj 0 -1.2 TD 0.0886 Tw [(restore the most recent cop)10(y)-338.6(from backup.)-588.6(The reco)15(v-)]TJ T* -0.0001 Tc 0.1299 Tw [(ery system will roll the entire log forw)9.9(ard against the)]TJ T* 0 Tc 0.264 Tw (original database, reapplying all committed changes.)Tj T* 0.4363 Tw (When it )Tj /TT9 1 Tf 4.316 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.4363 Tw [(nishes, the database will contain e)25(v)15(ery)]TJ -4.8721 -1.2 TD 0.0534 Tw [(change made by e)25(v)15(ery transaction that e)25(v)15(er)-303.4(committed.)]TJ 0 -1.62 TD 0.0494 Tw [(If the log disk is lost, then the reco)15(v)15(ery system can use)]TJ 0 -1.2 TD 0.1853 Tw [(the in-memory copies of log entries to roll back an)15(y)]TJ T* 0.0026 Tw (uncommitted transactions, )Tj /TT9 1 Tf 10.8084 0 TD 0 Tw (ß)Tj /TT4 1 Tf 0.5561 0 TD 0.0026 Tw (ush all in-memory database)Tj -11.3646 -1.2 TD 0.1659 Tw [(pages to the data disk, and shut do)25(wn gracefully)65(.)-665.8(A)0(t)]TJ T* 0.2204 Tw (that point, the system administrator can back up the)Tj T* 0.0039 Tw [(database disk, install a ne)25(w)-253.9(log disk, and restart the sys-)]TJ T* 0 Tw (tem.)Tj ET endstream endobj 23 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 25 0 obj << /Length 9301 >> stream BT /TT2 1 Tf 12 0 0 12 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.25 Tw (3.10.3. Checkpoints)Tj /TT4 1 Tf 10 0 0 10 79.2 691.8 Tm 0.3585 Tw [(Berk)10(ele)15(y)-608.5(D)0(B)-608.5(includes a checkpointing service that)]TJ 0 -1.2 TD 0.0263 Tw [(interacts with the reco)15(v)15(ery system.)-526.3(During normal pro-)]TJ T* 0.2415 Tw (cessing, both the log and the database are changing)Tj T* 0.0924 Tw [(continually)65(.)-592.5(A)0(t)-342.5(a)0(n)15(y)15( )-15(gi)25(v)25( )332.4(en)-342.4(instant, the on-disk v)15(ersions)]TJ T* 0.0414 Tw [(of the tw)10(o)-291.4(are not guaranteed to be consistent.)-541.4(The log)]TJ T* 0.3838 Tw (probably contains changes that are not yet in the)Tj T* 0 Tw (database.)Tj 0 -1.62 TD 0.0085 Tw [(When an application mak)10(es a)]TJ /TT6 1 Tf 12.0556 0 TD 0 Tw [(c)15(hec)20(kpoint)]TJ /TT4 1 Tf 4.2961 0 TD 0.0086 Tw [(,)-258.6(all committed)]TJ -16.3517 -1.2 TD 0.0443 Tw (changes in the log up to that point are guaranteed to be)Tj T* 0.0631 Tw [(present on the data disk, too.)-563.1(Checkpointing is moder)20(-)]TJ T* 0.0045 Tw [(ately e)15(xpensi)25(v)15(e)15( )-15.1(during normal processing, b)20(ut limits the)]TJ T* 0 Tw [(time spent reco)15(v)15(ering from crashes.)]TJ 0 -1.62 TD 0.3117 Tw (After an application or operating system crash, the)Tj 0 -1.2 TD 0.7419 Tw [(reco)15(v)15(ery system only needs to go back tw)10(o)]TJ 0 -1.4 TD 0 Tw (checkpoints)Tj 7 0 0 7 126.9637 517.4 Tm (1)Tj 10 0 0 10 134.3397 513.4 Tm 0.1376 Tw [(to start rolling the log forw)10(ard. )-249.9(W)40(ithout)]TJ -5.514 -1.2 TD 0.3264 Tw [(checkpoints, there is no w)10(ay to be sure ho)25(w)-576.5(long)]TJ T* 0.0395 Tw [(restarting after a crash will tak)10(e. )-250(W)40(ith checkpoints, the)]TJ T* 0.0088 Tw [(restart interv)25(al can be )]TJ /TT9 1 Tf 8.8948 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0089 Tw [(x)15(ed by the programmer)55(.)-508.9(Reco)15(v-)]TJ -9.4509 -1.2 TD 0.0668 Tw (ery processing can be guaranteed to complete in a sec-)Tj T* 0 Tw [(ond or tw)10(o.)]TJ 0 -1.62 TD 0.2457 Tw [(Softw)10(are crashes are much more common than disk)]TJ 0 -1.2 TD 0.0884 Tw [(f)10(ailures. )-250.1(Man)15(y)-338.5(d)0(e)25(v)25( )328.4(elopers w)10(ant to guarantee that soft-)]TJ T* 0.0158 Tw [(w)10(are b)20(ugs do not destro)10(y)-265.8(data, b)20(ut are willing to restore)]TJ T* 0.063 Tw [(from tape, and to tolerate a day or tw)10(o)-313.1(o)0(f)-313.1(lost w)10(ork, in)]TJ T* 0.089 Tw [(the unlikle)15(y)-339(e)25(v)15(ent of a disk crash.)-589(W)40(ith Berk)10(ele)15(y)-339(DB,)]TJ T* 0.1093 Tw [(programmers may truncate the log at checkpoints.)-609.2(As)]TJ T* 0.009 Tw [(long as the tw)10(o)-259(most recent checkpoints are present, the)]TJ T* 0.0106 Tw [(reco)15(v)15(ery system can guarantee that no committed trans-)]TJ T* 0.0611 Tw [(actions are lost after a softw)10(are crash.)-561.1(In this case, the)]TJ T* 0.1439 Tw [(reco)15(v)15(ery system does not require that the log and the)]TJ T* 0.1328 Tw [(data be on separate de)25(vices, although separating them)]TJ T* 0 Tw (can still impr)Tj 5.2769 0 TD -0.015 Tc (ove )Tj 1.6638 0 TD 0 Tc (performance by spreading out writes.)Tj /TT2 1 Tf 12 0 0 12 79.2 275.2 Tm 0.25 Tw [(3.10.4. T)74(w)10(o-phase )250(locking)]TJ /TT4 1 Tf 10 0 0 10 79.2 259 Tm 0.1915 Tw [(Berk)10(ele)15(y)-441.6(D)0(B)-441.6(pro)15(vides a service kno)25(wn as tw)10(o-phase)]TJ 0 -1.2 TD 0.0517 Tw [(locking. )-250(In)-301.7(order to reduce the lik)10(elihood of deadlocks)]TJ T* 0.2546 Tw [(and to guarantee A)40(CID properties, database systems)]TJ T* 0.0063 Tw [(manage locks in tw)10(o)-256.4(phases. )-250.1(First,)-256.4(during the operation)]TJ T* 0.1573 Tw [(of a transaction, the)15(y)-407.4(acquire locks, b)20(ut ne)25(v)15(e)0(r)-407.3(release)]TJ T* 0.3648 Tw [(them. )-249.9(Second,)-614.7(at the end of the transaction, the)15(y)]TJ T* 0.0235 Tw [(release locks, b)20(ut ne)25(v)15(e)0(r)-273.5(acquire them.)-523.5(In practice, most)]TJ T* 0.469 Tw [(database systems, including Berk)10(ele)15(y)-719(DB, acquire)]TJ T* 0.2314 Tw [(locks on demand o)15(v)15(e)0(r)-481.4(the course of the transaction,)]TJ T* 0 Tw (then )Tj /TT9 1 Tf 1.9717 0 TD (ß)Tj /TT4 1 Tf 0.5562 0 TD (ush the log, then release all locks.)Tj ET 0 G 1 J 1 j 0.32 w 10 M []0 d 1 i 79.23 141.39 m 223.23 141.39 l S BT 5 0 0 5 100.8 131 Tm (1)Tj 8 0 0 8 105.638 127.8 Tm 0.0422 Tw [(One checkpoint is not f)10(ar enough.)-542.2(The reco)15(v)15(ery system can-)]TJ -3.3047 -1.2 TD 0.0264 Tw [(not be sure that the most recent checkpoint completed — it may ha)20(v)15(e)]TJ T* 0.0917 Tw [(been interrupted by the crash that forced the reco)15(v)15(ery system to run)]TJ T* 0 Tw (in the )Tj /TT9 1 Tf 2.4995 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD (rst place.)Tj 10 0 0 10 323.2 708 Tm 0.0806 Tw [(Berk)10(ele)15(y)-330.6(D)0(B)-330.6(can lock entire database )]TJ /TT9 1 Tf 15.7853 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0806 Tw [(les, which cor)20(-)]TJ -16.3414 -1.2 TD 0.0844 Tw [(respond to tables, or indi)25(vidual pages in them.)-584.4(It does)]TJ T* 0.2141 Tw [(no record-le)25(v)15(e)0(l)-464.1(locking. )-250(By)-464.1(shrinking the page size,)]TJ T* 0.3626 Tw [(ho)25(we)25(v)15(e)0(r)40(,)40( )-40.1(de)25(v)25( )602.6(elopers can guarantee that e)25(v)15(ery page)]TJ T* 0.2101 Tw [(holds only a small number of records.)-710.2(This reduces)]TJ T* 0 Tw (contention.)Tj 0 -1.62 TD 0.0388 Tw (If locking is enabled, then read and write operations on)Tj 0 -1.2 TD 0.2817 Tw [(a)-531.7(database acquire tw)10(o-phase locks, which are held)]TJ T* 0.3635 Tw [(until the transaction completes.)-863.5(Which objects are)]TJ T* 0.0738 Tw [(lock)10(ed and the order of lock acquisition depend on the)]TJ T* 0.0502 Tw [(w)10(orkload for each transaction.)-550.2(It is possible for tw)10(o)-300.2(o)0(r)]TJ T* 0.1315 Tw [(more transactions to deadlock, so that each is w)10(aiting)]TJ T* 0 Tw [(for a lock that is held by another)55(.)]TJ 0 -1.62 TD 0.0807 Tw [(Berk)10(ele)15(y)-330.7(D)0(B)-330.7(detects deadlocks and automatically rolls)]TJ 0 -1.2 TD 0.1825 Tw [(back one of the transactions.)-682.5(This releases the locks)]TJ T* 0.1925 Tw [(that it held and allo)25(ws the other transactions to con-)]TJ T* 0.0847 Tw [(tinue. )-249.9(The)-334.6(caller is noti)]TJ /TT9 1 Tf 9.8357 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0847 Tw (ed that its transaction did not)Tj -10.3918 -1.2 TD 0.1747 Tw [(complete, and may restart it.)-674.7(De)25(v)15(elopers can specify)]TJ T* 0.0646 Tw [(the deadlock detection interv)25(al and the polic)15(y)-314.7(t)0(o)-314.7(use in)]TJ T* 0 Tw (choosing a transaction to roll back.)Tj 0 -1.62 TD 0.6686 Tw [(The tw)10(o-phase locking interf)10(aces are separately)]TJ 0 -1.2 TD 0.0927 Tw [(callable by applications that link Berk)10(ele)15(y)-342.7(DB, though)]TJ T* 0.314 Tw [(fe)25(w)-564(users ha)20(v)15(e)15( )-15(needed to use that f)10(acility directly)65(.)]TJ T* 0.2211 Tw [(Using these interf)10(aces, Berk)10(ele)15(y)-471.1(D)0(B)-471.2(pro)15(vides a f)10(ast,)]TJ T* 0.24 Tw (platform-portable locking system for general-purpose)Tj T* 0.0418 Tw [(use. )-249.9(It)-291.7(also lets users include non-database objects in a)]TJ T* 0.3497 Tw (database transaction, by controlling access to them)Tj T* 0 Tw [(e)15(xactly as if the)15(y)-250(were inside the database.)]TJ 0 -1.62 TD 0.0583 Tw [(The Berk)10(ele)15(y)-308.3(D)0(B)-308.4(t)0(w)10(o-phase locking f)10(acility is b)20(uilt on)]TJ 0 -1.2 TD 0.0608 Tw [(the f)10(astest correct locking primiti)25(v)15(e)0(s)-310.8(that are supported)]TJ T* 0.1967 Tw [(by the underlying architecture.)-696.7(In the current imple-)]TJ T* 0.0593 Tw [(mentation, this means that the locking system is dif)25(fer)20(-)]TJ T* 0.1709 Tw [(ent on the v)25(arious UNIX platforms, and is still more)]TJ T* 0.0695 Tw [(dif)25(ferent on W)40(indo)25(ws NT)74(.)-569.5(I)0(n)-319.5(our e)15(xperience, the most)]TJ T* 0 Tw (dif)Tj /TT9 1 Tf 1.0858 0 TD (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2634 Tw (cult aspect of performance tuning is )Tj /TT9 1 Tf 16.1864 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2634 Tw (nding the)Tj -18.3845 -1.2 TD 0.0882 Tw [(f)10(astest locking primiti)25(v)15(e)0(s)-338.3(that w)10(ork correctly on a par)20(-)]TJ T* 0.126 Tw [(ticular architecture and then inte)15(grating the ne)25(w)-376(inter)20(-)]TJ T* 0 Tw [(f)10(ace with the se)25(v)15(eral that we already support.)]TJ 0 -1.62 TD 0.0536 Tw [(The w)10(orld w)10(ould be a better place if the operating sys-)]TJ 0 -1.2 TD 0.2096 Tw [(tems community w)10(ould uniformly implement POSIX)]TJ T* 0.131 Tw [(locking primiti)25(v)15(e)0(s)-381(and w)10(ould guarantee that acquiring)]TJ T* 0.1085 Tw [(an uncontested lock w)10(as a f)10(ast operation.)-608.5(Locks must)]TJ T* 0.3641 Tw [(w)10(ork both among threads in a single process and)]TJ T* 0 Tw (among processes.)Tj /TT2 1 Tf 12 0 0 12 323.2 141 Tm 0.25 Tw [(3.11. Concurr)18(ency)]TJ /TT4 1 Tf 10 0 0 10 323.2 124.8 Tm 0.0383 Tw (Good performance under concurrent operation is a crit-)Tj T* 0.0766 Tw [(ical design point for Berk)10(ele)15(y)-326.6(DB. )-249.9(Although)-326.5(Berk)10(ele)15(y)]TJ T* 0.1961 Tw (DB is itself not multi-threaded, it is thread-safe, and)Tj T* 0.0547 Tw [(runs well in threaded applications.)-554.6(Philosophically)65(,)-304.6(w)0(e)]TJ T* 0.2264 Tw [(vie)25(w)-476.4(the use of threads and the choice of a threads)]TJ ET endstream endobj 26 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 28 0 obj << /Length 8993 >> stream BT /TT4 1 Tf 10 0 0 10 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.0065 Tw [(package as a polic)15(y)-256.6(decision, and prefer to of)25(fer mecha-)]TJ 0 -1.2 TD 0.0042 Tw [(nism \(the ability to run threaded or not\), allo)25(wing appli-)]TJ T* 0 Tw [(cations to choose their o)25(wn policies.)]TJ 0 -1.62 TD 0.1947 Tw [(The locking, logging, and b)20(u)0(f)25(fer pool subsystems all)]TJ 0 -1.2 TD 0.0711 Tw (use shared memory or other OS-speci)Tj /TT9 1 Tf 15.4346 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0711 Tw [(c sharing f)10(acili-)]TJ -15.9908 -1.2 TD 0.1713 Tw [(ties to communicate.)-671.3(Locks, b)20(u)0(f)25(fer pool fetches, and)]TJ T* 0.1061 Tw [(log writes beha)20(v)15(e)15( )-15(in)-356.1(the same w)10(ay across threads in a)]TJ T* 0.0032 Tw [(single process as the)15(y)-253.2(d)0(o)-253.2(across dif)25(ferent processes on a)]TJ T* 0 Tw (single machine.)Tj 0 -1.62 TD 0.0896 Tw (As a result, concurrent database applications may start)Tj 0 -1.2 TD 0.1651 Tw [(up a ne)25(w)-415.1(process for e)25(v)15(ery single user)40(,)-415.1(may create a)]TJ T* 0.2848 Tw [(single serv)15(er which spa)15(wns a ne)25(w)-534.9(thread for e)25(v)15(ery)]TJ T* 0 Tw [(client request, or may choose an)15(y)-250(polic)15(y)-250(i)0(n)-250(between.)]TJ 0 -1.62 TD 0.1128 Tw [(Berk)10(ele)15(y)-362.9(D)0(B)-362.9(has been carefully designed to minimize)]TJ 0 -1.2 TD 0.007 Tw [(contention and maximize concurrenc)15(y)65(.)65( )-315(The cache man-)]TJ T* 0.057 Tw [(ager allo)25(ws all threads or processes to bene)]TJ /TT9 1 Tf 17.6733 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.057 Tw (t from I/O)Tj -18.2295 -1.2 TD 0.2917 Tw [(done by one.)-791.7(Shared resources must sometimes be)]TJ T* 0.1804 Tw [(lock)10(ed for e)15(xclusi)25(v)15(e)15( )-15(access by one thread of control.)]TJ T* 0.0158 Tw [(W)80(e)80( )-79.9(ha)20(v)20( )260.8(e)-265.7(k)10(ept critical sections small, and are careful not)]TJ T* 0.1199 Tw (to hold critical resource locks across system calls that)Tj T* 0.0538 Tw [(could deschedule the locking thread or process.)-553.9(Sleep-)]TJ T* 0.0979 Tw [(ycat Softw)10(are has customers with hundreds of concur)20(-)]TJ T* 0 Tw [(rent users w)10(orking on a single database in production.)]TJ /TT2 1 Tf 12 0 0 12 79.2 401.4 Tm 0.25 Tw [(4. Engineering)-250(Philosoph)15(y)]TJ /TT4 1 Tf 10 0 0 10 79.2 385.2 Tm 0.1499 Tw [(Fundamentally)65(,)-399.8(Berk)10(ele)15(y)-399.8(D)0(B)-399.8(i)0(s)-399.9(a)-399.9(collection of access)]TJ T* 0.019 Tw [(methods with important f)10(acilities, lik)10(e)-269(logging, locking,)]TJ T* 0.1251 Tw [(and transactional access underlying them.)-625.2(In both the)]TJ T* 0.0991 Tw [(research and the commercial w)10(orld, the techniques for)]TJ T* 0.2727 Tw [(b)20(uilding systems lik)10(e)-522.7(Berk)10(ele)15(y)-522.7(D)0(B)-522.7(h)0(a)20(v)20( )517.7(e)-522.7(been well-)]TJ T* 0 Tw [(kno)25(wn for a long time.)]TJ 0 -1.62 TD 0.0442 Tw [(The k)10(e)15(y)15( )-15.1(adv)25(antage of Berk)10(ele)15(y)-294.2(D)0(B)-294.2(i)0(s)-294.2(the careful atten-)]TJ 0 -1.2 TD 0.1059 Tw (tion that has been paid to engineering details through-)Tj T* 0.1039 Tw [(out its life.)-603.9(W)80(e)80( )-80(ha)20(v)20( )348.9(e)-353.9(carefully designed the system so)]TJ T* 0.0452 Tw [(that the core f)10(acilities, lik)10(e)-295.2(locking and I/O, surf)10(ace the)]TJ T* 0.0971 Tw [(right interf)10(aces and are otherwise opaque to the caller)55(.)]TJ T* 0.0294 Tw [(As programmers, we understand the v)25(alue of simplicity)]TJ T* 0.0205 Tw [(and ha)20(v)15(e)15( )-15.1(w)10(ork)10(ed hard to simplify the interf)10(aces we sur)20(-)]TJ T* 0 Tw [(f)10(ace to users of the database system.)]TJ 0 -1.62 TD 0.2031 Tw [(Berk)10(ele)15(y)-453.1(D)0(B)-453.1(a)20(v)20(oids limits in the code.)-703.1(It places no)]TJ 0 -1.2 TD 0.0473 Tw [(practical limit on the size of k)10(e)15(ys, v)25(alues, or databases;)]TJ T* 0 Tw [(the)15(y)-250(may gro)25(w)-250(t)0(o)-250(occup)10(y)-250(the a)20(v)25(ailable storage space.)]TJ 0 -1.62 TD 0.1857 Tw [(The locking and logging subsystems ha)20(v)15(e)15( )-15(been care-)]TJ 0 -1.2 TD 0.0184 Tw (fully crafted to reduce contention and impr)Tj 17.2706 0 TD -0.015 Tc 0 Tw (ove )Tj 1.6822 0 TD 0 Tc (through-)Tj -18.9528 -1.2 TD 0.216 Tw (put by shrinking or eliminating critical sections, and)Tj T* 0 Tw [(reducing the sizes of lock)10(ed re)15(gions and log entries.)]TJ 0 -1.62 TD 0.2238 Tw (There is nothing in the design or implementation of)Tj 0 -1.2 TD 0.0318 Tw [(Berk)10(ele)15(y)-281.8(D)0(B)-281.8(that pushes the state of the art in database)]TJ T* 0.1044 Tw [(systems. )-250.1(Rather)40(,)-354.5(w)0(e)-354.5(h)0(a)20(v)20( )349.4(e)-354.5(been v)15(ery careful to get the)]TJ T* 0.4321 Tw [(engineering right.)-932.1(The result is a system that is)]TJ 24.4 62.76 TD 0.0366 Tw [(superior)40(,)-286.7(a)0(s)-286.7(a)0(n)-286.6(embedded database system, to an)15(y)-286.6(other)]TJ 0 -1.2 TD 0 Tw [(solution a)20(v)25(ailable.)]TJ 0 -1.62 TD 0.0811 Tw [(Most database systems trade of)25(f)-331.2(simplicity for correct-)]TJ 0 -1.2 TD 0.1651 Tw [(ness. )-250(Either)-415.1(the system is easy to use, or it supports)]TJ T* 0.117 Tw [(concurrent use and survi)25(v)15(e)0(s)-367(system f)10(ailures. )-250(Berk)10(ele)15(y)]TJ T* 0.1013 Tw (DB, because of its careful design and implementation,)Tj T* 0 Tw [(of)25(fers both simplicity and correctness.)]TJ 0 -1.62 TD 0.0759 Tw [(The system has a small footprint, mak)10(es simple opera-)]TJ 0 -1.2 TD 0.1012 Tw [(tions simple to carry out \(inserting a ne)25(w)-351.2(record tak)10(es)]TJ T* 0.116 Tw [(just a fe)25(w)-366(lines of code\), and beha)20(v)15(e)0(s)-366(correctly in the)]TJ T* 0.0527 Tw [(f)10(ace of hea)20(vy concurrent use, system crashes, and e)25(v)15(en)]TJ T* 0 Tw [(catastrophic f)10(ailures lik)10(e)-250(loss of a hard disk.)]TJ /TT2 1 Tf 12 0 0 12 323.2 537.6 Tm [(5. )-250(The)-250(Berk)10(eley DB 2.x Distrib)20(ution)]TJ /TT4 1 Tf 10 0 0 10 323.2 521.4 Tm 0.1671 Tw [(Berk)10(ele)15(y)-417.1(D)0(B)-417.1(i)0(s)-417.1(distrib)20(uted in source code form from)]TJ /TT6 1 Tf T* 0 Tw [(www)74(.sleepycat.com)]TJ /TT4 1 Tf 7.8132 0 TD 0.2321 Tw [(.)-732.2(Users are free to do)25(wnload and)]TJ -7.8132 -1.2 TD 0 Tw [(b)20(uild the softw)10(are, and to use it in their applications.)]TJ /TT2 1 Tf 12 0 0 12 323.2 467.4 Tm [(5.1. )-250(What)-250(is in the distrib)20(ution)]TJ /TT4 1 Tf 10 0 0 10 323.2 451.2 Tm 0.4827 Tw [(The distrib)20(ution is a compressed archi)25(v)15(e)15( )]TJ /TT9 1 Tf 19.2761 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.7328 Tw (le. It)Tj -19.8323 -1.2 TD 0.0057 Tw [(includes the source code for the Berk)10(ele)15(y)-255.6(D)0(B)-255.6(library)65(,)-255.6(a)0(s)]TJ T* 0.0453 Tw (well as documentation, test suites, and supporting utili-)Tj T* 0 Tw (ties.)Tj 0 -1.62 TD 0.2612 Tw [(The source code includes b)20(uild support for all sup-)]TJ 0 -1.2 TD 0.0254 Tw [(ported platforms.)-525.4(On UNIX systems Berk)10(ele)15(y)-275.5(D)0(B)-275.5(uses)]TJ T* 0.128 Tw (the GNU autocon)Tj /TT9 1 Tf 7.3097 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.128 Tw (guration tool,)Tj /TT8 1 Tf 5.8942 0 TD 0 Tw (autoconf)Tj /TT4 1 Tf 4.8008 0 TD [(,)-378(t)0(o)-378(iden-)]TJ -18.5608 -1.2 TD 0.0992 Tw [(tify the system and to b)20(uild the library and supporting)]TJ T* 0.3589 Tw [(utilities. Berk)10(ele)15(y)-358.9(D)0(B)-358.8(includes )250.1(speci)]TJ /TT9 1 Tf 15.2961 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.1088 Tw [(c b)20(uild en)40(viron-)]TJ -15.8523 -1.2 TD 0.0515 Tw [(ments for other platforms, such as VMS and W)40(indo)25(ws.)]TJ /TT2 1 Tf 12 0 0 12 323.2 309 Tm 0.25 Tw (5.1.1. Documentation)Tj /TT4 1 Tf 10 0 0 10 323.2 292.8 Tm 0.5008 Tw [(The distrib)20(uted system includes documentation in)]TJ T* 0.1626 Tw [(HTML format.)-662.6(The documentation is in tw)10(o)-412.7(parts: a)]TJ T* 0.0725 Tw (UNIX-style reference manual for use by programmers,)Tj T* 0 Tw (and a reference guide which is tutorial in nature.)Tj /TT2 1 Tf 12 0 0 12 323.2 226.8 Tm 0.25 Tw [(5.1.2. T)92(est )250(suite)]TJ /TT4 1 Tf 10 0 0 10 323.2 210.6 Tm 0.1107 Tw [(The softw)10(are also includes a complete test suite, writ-)]TJ T* 0.0154 Tw [(ten in Tcl.)-515.4(W)80(e)80( )-80(belie)25(v)15(e)15( )-15(that the test suite is a k)10(e)15(y)15( )-15(adv)25(an-)]TJ T* 0 Tw [(tage of Berk)10(ele)15(y)-250(D)0(B)-250(o)15(v)15(e)0(r)-250(comparable systems.)]TJ 0 -1.62 TD 0.2612 Tw [(First, the test suite allo)25(ws users who do)25(wnload and)]TJ 0 -1.2 TD 0.1731 Tw [(b)20(uild the softw)10(are to be sure that it is operating cor)20(-)]TJ T* 0 Tw [(rectly)65(.)]TJ 0 -1.62 TD 0.0893 Tw [(Second, the test suite allo)25(ws us, lik)10(e)-339.4(other commercial)]TJ 0 -1.2 TD 0.0535 Tw [(de)25(v)15(elopers of database softw)10(are, to e)15(x)15(ercise the system)]TJ T* 0.2256 Tw [(thoroughly at e)25(v)15(ery release.)-725.6(When we learn of ne)25(w)]TJ T* 0.1719 Tw [(b)20(ugs, we add them to the test suite.)-671.9(W)80(e)80( )-80(run the test)]TJ T* 0.5692 Tw [(suite continually during de)25(v)15(elopment c)15(ycles, and)]TJ ET endstream endobj 29 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT8 7 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 31 0 obj << /Length 8823 >> stream BT /TT4 1 Tf 10 0 0 10 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.0314 Tw [(al)10(w)10(ays prior to release.)-531.4(The result is a much more reli-)]TJ 0 -1.2 TD 0 Tw (able system by the time it reaches beta release.)Tj /TT2 1 Tf 12 0 0 12 79.2 666 Tm 0.25 Tw [(5.2. Binary)-250(distrib)20(ution)]TJ /TT4 1 Tf 10 0 0 10 79.2 649.8 Tm 0.0893 Tw [(Sleep)10(ycat mak)10(es compiled libraries and general binary)]TJ T* 0 Tw [(distrib)20(utions a)20(v)25(ailable to customers for a fee.)]TJ /TT2 1 Tf 12 0 0 12 79.2 607.8 Tm 0.25 Tw [(5.3. Supported)-250(platf)25(orms)]TJ /TT4 1 Tf 10 0 0 10 79.2 591.6 Tm 0.3122 Tw [(Berk)10(ele)15(y)-562.3(D)0(B)-562.3(runs on an)15(y)-562.2(operating system with a)]TJ T* 0.0816 Tw [(POSIX 1003.1 interf)10(ace [IEEE96], which includes vir)20(-)]TJ T* 0.1997 Tw [(tually e)25(v)15(ery UNIX system.)-699.7(In addition, the softw)10(are)]TJ T* 0.285 Tw [(runs on VMS, W)40(indo)25(ws/95, W)40(indo)25(ws/98, and W)40(in-)]TJ T* 0.521 Tw [(do)25(ws/NT)74(.)-1021(Sleep)10(ycat Softw)10(are no longer supports)]TJ T* 0 Tw [(deplo)10(yment on sixteen-bit W)40(indo)25(ws systems.)]TJ /TT2 1 Tf 12 0 0 12 79.2 501.6 Tm [(6. )-250(Berk)10(eley DB 2.x Licensing)]TJ /TT4 1 Tf 10 0 0 10 79.2 485.4 Tm 0.0128 Tw [(Berk)10(ele)15(y)-262.7(D)0(B)-262.7(2.x is distrib)20(uted as an Open Source prod-)]TJ T* 0.2209 Tw [(uct. )-250(The)-470.9(softw)10(are is freely a)20(v)25(ailable from us at our)]TJ T* 0.0872 Tw [(W)80(e)0(b)-337.2(site, and in other media.)-587.2(Users are free to do)25(wn-)]TJ T* 0 Tw [(load the softw)10(are and b)20(uild applications with it.)]TJ 0 -1.62 TD 0.1022 Tw [(The 1.x v)15(ersions of Berk)10(ele)15(y)-352.2(D)0(B)-352.2(were co)15(v)15(ered by the)]TJ 0 -1.2 TD 0.3763 Tw [(UC Berk)10(ele)15(y)-626.3(cop)10(yright that co)15(v)15(ers softw)10(are freely)]TJ T* 0.1741 Tw [(redistrib)20(utable in source form.)-674.2(When Sleep)10(ycat Soft-)]TJ T* 0.0906 Tw [(w)10(are w)10(as formed, we needed to draft a license consis-)]TJ T* 0.2318 Tw [(tent with the cop)10(yright go)15(v)15(erning the e)15(xisting, older)]TJ T* 0.2828 Tw [(softw)10(are. )-250(Because)-532.8(of important dif)25(ferences between)]TJ T* 0.0496 Tw [(the UC Berk)10(ele)15(y)-299.7(cop)10(yright and the GPL, it w)10(as impos-)]TJ T* 0.0884 Tw [(sible for us to use the GPL.)-588.4(A)-338.4(second cop)10(yright, with)]TJ T* 0.087 Tw (terms contradictory to the )Tj /TT9 1 Tf 10.9002 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.087 Tw [(rst, simply w)10(ould not ha)20(v)15(e)]TJ -11.4564 -1.2 TD 0 Tw [(w)10(ork)10(ed.)]TJ 0 -1.62 TD 0.2533 Tw [(Sleep)10(ycat w)10(anted to continue Open Source de)25(v)15(elop-)]TJ 0 -1.2 TD 0.2079 Tw [(ment of Berk)10(ele)15(y)-457.9(D)0(B)-457.9(for se)25(v)15(eral reasons.)-707.8(W)80(e)80( )-79.9(agree)]TJ T* 0.0853 Tw (with Raymond [Raym98] and others that Open Source)Tj T* 0.0763 Tw [(softw)10(are is typically of higher quality than proprietary)65(,)]TJ T* 0.2616 Tw [(binary-only products.)-761.6(Our customers bene)]TJ /TT9 1 Tf 18.1535 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2617 Tw (t from a)Tj -18.7097 -1.2 TD 0.0983 Tw [(community of de)25(v)15(elopers who kno)25(w)-348.3(and use Berk)10(ele)15(y)]TJ T* 0.1317 Tw [(DB, and can help with application design, deb)20(ugging,)]TJ T* 0.165 Tw [(and performance tuning.)-665(W)40(idespread distrib)20(ution and)]TJ T* 0.1017 Tw [(use of the source code tends to isolate b)20(ugs early)65(,)-351.7(and)]TJ T* 0.0032 Tw (to get )Tj /TT9 1 Tf 2.5059 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.0031 Tw [(x)15(es back into the distrib)20(uted system quickly)65(.)-503.1(A)0(s)]TJ -3.0621 -1.2 TD 0.1053 Tw [(a)-355.3(result, Berk)10(ele)15(y)-355.3(D)0(B)-355.3(i)0(s)-355.3(more reliable.)-605.4(Just as impor)20(-)]TJ T* 0.1195 Tw [(tantly)65(,)-369.5(indi)25(vidual users are able to contrib)20(ute ne)25(w)-369.5(fea-)]TJ T* 0.1056 Tw (tures and performance enhancements, to the bene)Tj /TT9 1 Tf 20.3748 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.1056 Tw (t of)Tj -20.931 -1.2 TD 0.0358 Tw [(e)25(v)25( )275.8(eryone who uses Berk)10(ele)15(y)-285.9(DB. )-250.1(From)-285.9(a)-285.8(b)20(usiness per)20(-)]TJ T* 0.0615 Tw [(specti)25(v)15(e)0(,)-311.5(Open Source and free distrib)20(ution of the soft-)]TJ T* 0.1605 Tw [(w)10(are creates share for us, and gi)25(v)15(e)0(s)-410.5(u)0(s)-410.5(a)-410.5(mark)10(et into)]TJ T* 0.0412 Tw [(which we can sell products and services.)-541.3(Finally)65(,)-291.3(mak-)]TJ T* 0.0147 Tw [(ing the source code freely a)20(v)25(ailable reduces our support)]TJ T* 0.2436 Tw (load, since customers can )Tj /TT9 1 Tf 11.4432 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2436 Tw (nd and )Tj /TT9 1 Tf 3.431 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD 0.2436 Tw [(x b)20(ugs without)]TJ -15.9865 -1.2 TD 0 Tw [(recourse to us, in man)15(y)-250(cases.)]TJ 24.4 62.7 TD 0.3126 Tw [(T)80(o)80( )-80.1(preserv)15(e)-562.7(the Open Source heritage of the older)]TJ 0 -1.2 TD 0.0504 Tw [(Berk)10(ele)15(y)-300.3(D)0(B)-300.3(code, we drafted a ne)25(w)-300.4(license go)15(v)15(erning)]TJ T* 0.0416 Tw [(the distrib)20(ution of Berk)10(ele)15(y)-291.6(D)0(B)-291.6(2.x. )-250(W)80(e)-291.6(adopted terms)]TJ T* 0.0411 Tw [(from the GPL that mak)10(e)-291.1(i)0(t)-291.1(impossible to turn our Open)]TJ T* 0.1288 Tw [(Source code into proprietary code o)25(wned by someone)]TJ T* 0 Tw (else.)Tj 0 -1.62 TD (Brie)Tj /TT9 1 Tf 1.7217 0 TD (ß)Tj /TT4 1 Tf 0.5562 0 TD 0.068 Tw [(y)65(,)-318(the terms go)15(v)15(erning the use and distrib)20(ution of)]TJ -2.2778 -1.2 TD 0 Tw [(Berk)10(ele)15(y)-250(D)0(B)-250(are:)]TJ 8 0 0 8 328.2 603.6 Tm (•)Tj 10 0 0 10 348.2008 603.6 Tm (your application must be internal to your site, or)Tj 8 0 0 8 328.2 587.4 Tm (•)Tj 10 0 0 10 348.2008 587.4 Tm 0.0611 Tw [(your application must be freely redistrib)20(utable in)]TJ T* 0 Tw (source form, or)Tj 8 0 0 8 328.2 559.2 Tm (•)Tj 10 0 0 10 348.2008 559.2 Tm (you must get a license from us.)Tj -2.5001 -1.62 TD 0.0131 Tw [(F)15(o)0(r)-263.1(customers who prefer not to distrib)20(ute Open Source)]TJ 0 -1.2 TD 0.1492 Tw [(products, we sell licenses to use and e)15(xtend Berk)10(ele)15(y)]TJ T* 0 Tw (DB at a reasonable cost.)Tj 0 -1.62 TD 0.1076 Tw [(W)80(e)80( )-79.9(w)10(ork hard to accommodate the needs of the Open)]TJ 0 -1.2 TD 0.0605 Tw [(Source community)65(.)-560.6(F)15(or e)15(xample, we ha)20(v)15(e)15( )-15(crafted spe-)]TJ T* 0.1415 Tw (cial licensing arrangements with Gnome to encourage)Tj T* 0 Tw [(its use and distrib)20(ution of Berk)10(ele)15(y)-250(DB.)]TJ 0 -1.62 TD 0.1603 Tw [(Berk)10(ele)15(y)-410.3(D)0(B)-410.3(conforms to the Open Source de)]TJ /TT9 1 Tf 19.5087 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD (nition)Tj -20.0649 -1.2 TD 0.2367 Tw [([Open99]. )-250(The)-486.7(license has been carefully crafted to)]TJ T* 0.0642 Tw [(k)10(eep the product a)20(v)25(ailable as an Open Source of)25(fering,)]TJ T* 0 Tw [(while pro)15(viding enough of a return on our in)40(v)15(estment to)]TJ T* 0.1546 Tw [(fund continued de)25(v)15(elopment and support of the prod-)]TJ T* 0.0534 Tw [(uct. )-249.9(The)-303.3(current license has created a b)20(usiness capable)]TJ T* 0.0916 Tw [(of funding three years of de)25(v)15(elopment on the softw)10(are)]TJ T* 0 Tw [(that simply w)10(ould not ha)20(v)15(e)15( )-15(happened otherwise.)]TJ /TT2 1 Tf 12 0 0 12 323.2 336.6 Tm 0.25 Tw (7. Summary)Tj /TT4 1 Tf 10 0 0 10 323.2 320.4 Tm 0.0491 Tw [(Berk)10(ele)15(y)-299.1(D)0(B)-299.1(o)0(f)25(fers a unique collection of features, tar)20(-)]TJ T* 0.0174 Tw [(geted squarely at softw)10(are de)25(v)15(elopers who need simple,)]TJ T* 0.0492 Tw (reliable database management services in their applica-)Tj T* 0.28 Tw [(tions. )-250(Good)-530(design and implementation and careful)]TJ T* 0.1633 Tw [(engineering throughout mak)10(e)-413.3(the softw)10(are better than)]TJ T* 0 Tw [(man)15(y)-250(other systems.)]TJ 0 -1.62 TD 0.16 Tw [(Berk)10(ele)15(y)-410(D)0(B)-410(i)0(s)-410(a)0(n)-410(Open Source product, a)20(v)25(ailable at)]TJ /TT6 1 Tf 0 -1.2 TD 0 Tw [(www)74(.sleepycat.com)]TJ /TT4 1 Tf 8.1286 0 TD 0.0654 Tw [(for do)25(wnload. )-250(The)-315.4(distrib)20(uted sys-)]TJ -8.1286 -1.2 TD 0.0382 Tw [(tem includes e)25(v)15(erything needed to b)20(uild and deplo)10(y)-288.2(the)]TJ T* 0 Tw [(softw)10(are or to port it to ne)25(w)-250(systems.)]TJ 0 -1.62 TD 0.2633 Tw [(Sleep)10(ycat Softw)10(are distrib)20(utes Berk)10(ele)15(y)-513.3(D)0(B)-513.4(under a)]TJ 0 -1.2 TD 0.0764 Tw [(license agreement that dra)15(ws on both the UC Berk)10(ele)15(y)]TJ T* 0.2377 Tw [(cop)10(yright and the GPL.)-737.7(The license guarantees that)]TJ T* 0.0884 Tw [(Berk)10(ele)15(y)-338.4(D)0(B)-338.4(will remain an Open Source product and)]TJ T* 0.1493 Tw [(pro)15(vides Sleep)10(ycat with opportunities to mak)10(e)-399.4(mone)15(y)]TJ T* 0 Tw [(to fund continued de)25(v)15(elopment on the softw)10(are.)]TJ ET endstream endobj 32 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT9 8 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 34 0 obj << /Length 3542 >> stream BT /TT2 1 Tf 12 0 0 12 79.2 708 Tm 0 g /GS1 gs 0 Tc 0.25 Tw [(8. Refer)18(ences)]TJ /TT4 1 Tf 10 0 0 10 79.2 691.8 Tm 0 Tw ([Come79])Tj 2.5 -1.2 TD 0.0627 Tw [(Comer)40(,)-312.7(D., “The Ubiquitous B-tree,)70(”)]TJ /TT6 1 Tf 15.283 0 TD 0 Tw [(A)30(C)0(M)-312.6(Com-)]TJ -15.283 -1.2 TD 0.0404 Tw [(puting Surve)30(ys)]TJ /TT4 1 Tf 6.2163 0 TD [(V)129(olume 11, number 2, June 1979.)]TJ -8.7163 -1.62 TD 0 Tw ([Gray93])Tj 2.5 -1.2 TD 0.0482 Tw [(Gray)65(,)-298.2(J., and Reuter)40(,)-298.2(A.,)]TJ /TT6 1 Tf 10.1056 0 TD -0.21 Tw [(T)55(r)55( ansaction )-258.1(Pr)45(ocessing:)]TJ -10.1056 -1.2 TD 0.6776 Tw [(Concepts and T)92(e)0(c)15(hniques)]TJ /TT4 1 Tf 11.5246 0 TD -0.0004 Tc 0 Tw [(,)-928.1(Mor)17.6(gan-Kaufman)]TJ -11.5246 -1.2 TD 0 Tc (Publishers, 1993.)Tj -2.5 -1.62 TD ([IEEE96])Tj 2.5 -1.2 TD 0.0364 Tw (Institute for Electrical and Electronics Engineers,)Tj /TT6 1 Tf T* 0 Tw (IEEE/ANSI Std 1003.1)Tj /TT4 1 Tf 9.082 0 TD [(,)-250(1996 Edition.)]TJ -11.582 -1.62 TD ([Litw80])Tj 2.5 -1.2 TD 0.2365 Tw [(Litwin, W)92(., “Linear Hashing: A Ne)25(w)-486.6(T)80(ool for)]TJ T* 0.1783 Tw [(File and T)80(able Addressing,)70(”)]TJ /TT6 1 Tf 12.0883 0 TD [(Pr)45(oceedings of the)]TJ -12.0883 -1.2 TD 0.4804 Tw [(6th International Confer)37(ence on V)111(ery Lar)37(g)10(e)]TJ T* 0.1983 Tw (Databases \(VLDB\))Tj /TT4 1 Tf 7.8365 0 TD 0.1982 Tw [(,)-448.3(Montreal, Quebec, Canada,)]TJ -7.8365 -1.2 TD 0 Tw (October 1980.)Tj -2.5 -1.62 TD ([Open94])Tj 2.5 -1.2 TD 0.4068 Tw (The Open Group,)Tj /TT6 1 Tf 8.4963 0 TD [(Distrib)20(uted TP: The XA+)]TJ -8.4963 -1.2 TD 0 Tw (Speci)Tj /TT11 1 Tf 2.1655 0 TD (Þ)Tj /TT6 1 Tf 0.5 0 TD 0.078 Tw [(cation, V)111(e)0(r)10(sion 2)]TJ /TT4 1 Tf 6.8954 0 TD [(,)-328(The Open Group, 1994.)]TJ -12.0609 -1.62 TD 0 Tw ([Open99])Tj 2.5 -1.2 TD 0.8307 Tw [(Opensource.or)18(g, “Open Source De)]TJ /TT9 1 Tf 16.3857 0 TD 0 Tw (Þ)Tj /TT4 1 Tf 0.5562 0 TD [(nition,)70(”)]TJ /TT6 1 Tf -16.9419 -1.2 TD [(www)74(.opensour)37(ce)15(.or)37(g/osd.html)]TJ /TT4 1 Tf 12.0318 0 TD 0.063 Tw [(,)-313(v)15(ersion 1.4, 1999.)]TJ -14.5318 -1.62 TD 0 Tw ([Raym98])Tj 2.5 -1.2 TD 0.0718 Tw [(Raymond, E.S., “The Cathedral and the Bazaar)40(,)70(”)]TJ /TT6 1 Tf T* 0 Tw [(www)74(.tuxedo.or)37(g/˜esr/writings/cathedr)15(al-)]TJ T* [(bazaar/cathedr)15(al-bazaar)111(.html)]TJ /TT4 1 Tf 11.9018 0 TD [(,)-250(January 1998.)]TJ -14.4018 -1.62 TD ([Selt91])Tj 2.5 -1.2 TD 0.0078 Tw [(Seltzer)40(,)-257.8(M., and Y)55(igit, O., “)80(A)-257.9(Ne)25(w)25( )-25.1(Hashing P)15(ack-)]TJ T* 0.6704 Tw [(age for UNIX,)70(”)]TJ /TT6 1 Tf 8.4383 0 TD [(Pr)45(oceedings 1991 W)55(inter)]TJ -8.4383 -1.2 TD 0 Tw [(USENIX Confer)37(ence)]TJ /TT4 1 Tf 8.2662 0 TD [(,)-250(Dallas, TX, January 1991.)]TJ -10.7662 -1.62 TD ([Selt92])Tj 2.5 -1.2 TD 0.2865 Tw [(Seltzer)40(,)-536.5(M., and Olson, M., “LIBTP: Portable)]TJ T* 0.2845 Tw [(Modular T)35(ransactions for UNIX,)70(”)]TJ /TT6 1 Tf 14.9456 0 TD 0 Tw [(Pr)45(oceedings)]TJ -14.9456 -1.2 TD 0.149 Tw [(1992 W)55(inter Usenix Confer)37(ence)]TJ /TT4 1 Tf 13.2129 0 TD [(,)-399(San Francisco,)]TJ -13.2129 -1.2 TD 0 Tw (CA, January 1992.)Tj -2.5 -1.62 TD ([Ston82])Tj 2.5 -1.2 TD 0.754 Tw [(Stonebrak)10(er)40(,)-1004(M., Stettner)40(,)-1004(H., Kalash, J.,)]TJ T* 0.0763 Tw [(Guttman, A., and L)55(ynn, N., “Document Process-)]TJ T* 0.0557 Tw [(ing in a Relational Database System,)70(”)-305.6(Memoran-)]TJ T* 0.0825 Tw [(dum No. UCB/ERL M82/32, Uni)25(v)15(ersity of Cali-)]TJ T* 0 Tw [(fornia at Berk)10(ele)15(y)65(,)65( )-65(Berk)10(ele)15(y)65(,)65( )-65(CA, May 1982.)]TJ ET endstream endobj 35 0 obj << /ProcSet [/PDF /Text ] /Font << /TT2 4 0 R /TT4 5 0 R /TT6 6 0 R /TT9 8 0 R /TT11 36 0 R >> /ExtGState << /GS1 9 0 R >> >> endobj 9 0 obj << /Type /ExtGState /SA false /SM 0.02 /OP false /op false /OPM 1 /BG2 /Default /UCR2 /Default /HT /Default /TR2 /Default >> endobj 37 0 obj << /Type /FontDescriptor /Ascent 750 /CapHeight 676 /Descent -250 /Flags 262178 /FontBBox [-168 -218 1000 935] /FontName /Times-Bold /ItalicAngle 0 /StemV 133 /XHeight 461 /StemH 139 >> endobj 38 0 obj << /Type /FontDescriptor /Ascent 750 /CapHeight 662 /Descent -250 /Flags 34 /FontBBox [-168 -218 1000 898] /FontName /Times-Roman /ItalicAngle 0 /StemV 84 /XHeight 450 /StemH 84 >> endobj 39 0 obj << /Type /FontDescriptor /Ascent 750 /CapHeight 653 /Descent -250 /Flags 98 /FontBBox [-169 -217 1010 883] /FontName /Times-Italic /ItalicAngle -15 /StemV 76 /XHeight 441 /StemH 76 >> endobj 40 0 obj << /Type /FontDescriptor /Ascent 753 /CapHeight 562 /Descent -246 /Flags 35 /FontBBox [-28 -250 628 805] /FontName /Courier /ItalicAngle 0 /StemV 51 /XHeight 426 /StemH 51 >> endobj 41 0 obj << /Type /FontDescriptor /Ascent 750 /CapHeight 662 /Descent -250 /Flags 34 /FontBBox [-168 -218 1000 898] /FontName /Times-Roman /ItalicAngle 0 /StemV 84 /XHeight 450 /StemH 84 >> endobj 42 0 obj << /Type /FontDescriptor /Ascent 750 /CapHeight 676 /Descent -250 /Flags 262178 /FontBBox [-168 -218 1000 935] /FontName /Times-Bold /ItalicAngle 0 /StemV 133 /XHeight 461 /StemH 139 >> endobj 43 0 obj << /Type /FontDescriptor /Ascent 750 /CapHeight 653 /Descent -250 /Flags 98 /FontBBox [-169 -217 1010 883] /FontName /Times-Italic /ItalicAngle -15 /StemV 76 /XHeight 441 /StemH 76 >> endobj 4 0 obj << /Type /Font /Subtype /TrueType /FirstChar 32 /LastChar 122 /Widths [250 0 0 0 0 0 0 0 0 0 0 570 0 333 250 0 500 500 500 500 500 500 500 500 500 500 0 0 0 0 0 0 0 722 667 722 722 667 611 0 778 389 500 0 667 944 0 778 611 0 722 556 667 0 0 1000 0 0 0 0 0 0 0 0 0 500 556 444 556 444 333 500 556 278 0 556 278 833 556 500 556 0 444 389 333 556 500 722 500 500 444 ] /Encoding /WinAnsiEncoding /BaseFont /Times-Bold /FontDescriptor 37 0 R >> endobj 5 0 obj << /Type /Font /Subtype /TrueType /FirstChar 32 /LastChar 151 /Widths [250 0 0 0 0 0 778 0 333 333 0 564 250 333 250 278 500 500 500 500 500 500 500 500 500 500 278 278 0 0 0 0 0 722 667 667 722 611 556 722 722 333 389 722 611 889 722 722 556 722 667 556 611 722 722 944 722 722 0 333 0 333 0 0 0 444 500 444 500 444 333 500 500 278 278 500 278 778 500 500 500 500 333 389 278 500 500 722 500 500 444 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 333 444 444 350 0 1000 ] /Encoding /WinAnsiEncoding /BaseFont /Times-Roman /FontDescriptor 38 0 R >> endobj 6 0 obj << /Type /Font /Subtype /TrueType /FirstChar 32 /LastChar 152 /Widths [250 0 0 0 0 0 0 0 333 333 0 675 250 333 250 278 500 500 500 500 0 0 500 0 0 500 333 0 0 0 0 0 0 611 611 667 722 611 0 0 0 333 0 0 556 833 667 0 611 0 611 500 556 722 611 833 611 0 0 0 0 0 0 0 0 500 500 444 500 444 278 500 500 278 0 444 278 722 500 500 500 500 389 389 278 500 444 667 444 444 389 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 333 ] /Encoding /WinAnsiEncoding /BaseFont /Times-Italic /FontDescriptor 39 0 R >> endobj 7 0 obj << /Type /Font /Subtype /TrueType /FirstChar 68 /LastChar 117 /Widths [600 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 600 600 600 600 600 600 600 600 0 0 0 0 600 600 600 0 0 600 600 600 600 ] /Encoding /WinAnsiEncoding /BaseFont /Courier /FontDescriptor 40 0 R >> endobj 8 0 obj << /Type /Font /Subtype /TrueType /FirstChar 222 /LastChar 223 /Widths [556 556 ] /Encoding /MacRomanEncoding /BaseFont /Times-Roman /FontDescriptor 41 0 R >> endobj 20 0 obj << /Type /Font /Subtype /TrueType /FirstChar 222 /LastChar 222 /Widths [556 ] /Encoding /MacRomanEncoding /BaseFont /Times-Bold /FontDescriptor 42 0 R >> endobj 36 0 obj << /Type /Font /Subtype /TrueType /FirstChar 222 /LastChar 222 /Widths [500 ] /Encoding /MacRomanEncoding /BaseFont /Times-Italic /FontDescriptor 43 0 R >> endobj 1 0 obj << /Type /Page /Parent 10 0 R /Resources 3 0 R /Contents 2 0 R >> endobj 11 0 obj << /Type /Page /Parent 10 0 R /Resources 13 0 R /Contents 12 0 R >> endobj 14 0 obj << /Type /Page /Parent 10 0 R /Resources 16 0 R /Contents 15 0 R >> endobj 17 0 obj << /Type /Page /Parent 10 0 R /Resources 19 0 R /Contents 18 0 R >> endobj 21 0 obj << /Type /Page /Parent 10 0 R /Resources 23 0 R /Contents 22 0 R >> endobj 24 0 obj << /Type /Page /Parent 10 0 R /Resources 26 0 R /Contents 25 0 R >> endobj 27 0 obj << /Type /Page /Parent 10 0 R /Resources 29 0 R /Contents 28 0 R >> endobj 30 0 obj << /Type /Page /Parent 10 0 R /Resources 32 0 R /Contents 31 0 R >> endobj 33 0 obj << /Type /Page /Parent 10 0 R /Resources 35 0 R /Contents 34 0 R >> endobj 44 0 obj << /S /D >> endobj 45 0 obj << /Nums [0 44 0 R ] >> endobj 10 0 obj << /Type /Pages /Kids [1 0 R 11 0 R 14 0 R 17 0 R 21 0 R 24 0 R 27 0 R 30 0 R 33 0 R] /Count 9 /MediaBox [0 0 612 792] >> endobj 46 0 obj << /CreationDate (D:20090603204335-07'00') /ModDate (D:20090603204335-07'00') /Producer (Apple pstopdf) >> endobj 47 0 obj << /Type /Catalog /Pages 10 0 R /PageLabels 45 0 R >> endobj xref 0 48 0000000000 65535 f 0000079461 00000 n 0000000016 00000 n 0000007432 00000 n 0000077089 00000 n 0000077550 00000 n 0000078120 00000 n 0000078650 00000 n 0000078945 00000 n 0000075560 00000 n 0000080282 00000 n 0000079542 00000 n 0000007571 00000 n 0000016356 00000 n 0000079626 00000 n 0000016474 00000 n 0000025605 00000 n 0000079710 00000 n 0000025745 00000 n 0000034766 00000 n 0000079119 00000 n 0000079794 00000 n 0000034897 00000 n 0000044020 00000 n 0000079878 00000 n 0000044149 00000 n 0000053503 00000 n 0000079962 00000 n 0000053632 00000 n 0000062678 00000 n 0000080046 00000 n 0000062818 00000 n 0000071694 00000 n 0000080130 00000 n 0000071823 00000 n 0000075418 00000 n 0000079289 00000 n 0000075700 00000 n 0000075902 00000 n 0000076099 00000 n 0000076299 00000 n 0000076490 00000 n 0000076687 00000 n 0000076889 00000 n 0000080214 00000 n 0000080242 00000 n 0000080420 00000 n 0000080543 00000 n trailer << /Size 48 /Root 47 0 R /Info 46 0 R /ID [] >> startxref 80613 %%EOF