Assignment 4: External Hashing
VerifiedAdded on 2019/09/18
|3
|1557
|446
Project
AI Summary
This assignment involves creating a system for storing student information using an external hashing structure. The system utilizes an in-memory hash table and a random access file to simulate external storage. Students must implement basic operations: insert, search, and delete. The hash table uses block numbers to index blocks in the external file, and linear probing is used within each block. The assignment includes detailed instructions on implementing the `IndexHashTable` class, handling file I/O, and managing record insertion, deletion, and searching. Error handling for full blocks and duplicate IDs is also required. The project requires the use of provided `Record` and `RandomFile` classes and culminates in an application class (`ExternalHashing`) that provides a user interface for interacting with the system.

Assignment 4 (Hash table and external storage):
Problem to be solved:
Implement the external hashing structure to store the students’ information for a
department. Provide basic operations: insert, search and delete to the system.
Assumptions:
The data structure is implemented by an in-memory hash table and a random
access file as an external-memory. The file is for simulating the external storage
of a disc. Assume the file consists of n disc blocks and each block contains r
records maximally.
The number of total records need to be stored is not more than (n * r) / 2.
Each record is defined as follow:
It consists of a name and an id, both are String type. The id is a key which
uniquely represents a record.
The read operation reads the record from a file
The write operation writes the record into the file
Each element of the in-memory hash table contains a block number which is an
index of a block in the external file. Refer the fig. 11.15 in p. 572 in the textbook.
Once the external block is identified, use the linear probing to find the element
location inside of the external blocks (for all operations). The linear probing here
means a sequential search record-by-record within a block.
Do not forget the special consideration of deleting on the effect of other
operations.
When user chooses to quit from the program, display the number of records in
each block at the time.
Detail guides:
1. Implement an Index Hash Table by a class named IndexHashTable. Each element
of the table contains a block number (int) in the range of 0 to n-1. Each block
number is an index of the block in the external file. For example, if one block size is
300 bytes and the index is 6, the offset (the relative address to the beginning of the
file) of the block 6 begin address should be 6*300=1800 byte in the file. In another
word, the block 6 starts from byte 1800 and ends at byte 2099 in the file.
2. To simulate the randomly distributed block numbers in the hash table, 0 to n-1
indexes should be randomly distributed in the hash table. These numbers should be
inserted in to the table when the table is initialized (in the constructor of
IndexHashTable class). The block numbers should be unique in the table.
3. Create a file to store the data (The instructor will provide a sample code. See 5.2.
below)
4. The operations in the hash table are insert, delete and search. To guide your
implementation, here is an algorithm and some discussions on programming issues
for the operation “Insert”:
Problem to be solved:
Implement the external hashing structure to store the students’ information for a
department. Provide basic operations: insert, search and delete to the system.
Assumptions:
The data structure is implemented by an in-memory hash table and a random
access file as an external-memory. The file is for simulating the external storage
of a disc. Assume the file consists of n disc blocks and each block contains r
records maximally.
The number of total records need to be stored is not more than (n * r) / 2.
Each record is defined as follow:
It consists of a name and an id, both are String type. The id is a key which
uniquely represents a record.
The read operation reads the record from a file
The write operation writes the record into the file
Each element of the in-memory hash table contains a block number which is an
index of a block in the external file. Refer the fig. 11.15 in p. 572 in the textbook.
Once the external block is identified, use the linear probing to find the element
location inside of the external blocks (for all operations). The linear probing here
means a sequential search record-by-record within a block.
Do not forget the special consideration of deleting on the effect of other
operations.
When user chooses to quit from the program, display the number of records in
each block at the time.
Detail guides:
1. Implement an Index Hash Table by a class named IndexHashTable. Each element
of the table contains a block number (int) in the range of 0 to n-1. Each block
number is an index of the block in the external file. For example, if one block size is
300 bytes and the index is 6, the offset (the relative address to the beginning of the
file) of the block 6 begin address should be 6*300=1800 byte in the file. In another
word, the block 6 starts from byte 1800 and ends at byte 2099 in the file.
2. To simulate the randomly distributed block numbers in the hash table, 0 to n-1
indexes should be randomly distributed in the hash table. These numbers should be
inserted in to the table when the table is initialized (in the constructor of
IndexHashTable class). The block numbers should be unique in the table.
3. Create a file to store the data (The instructor will provide a sample code. See 5.2.
below)
4. The operations in the hash table are insert, delete and search. To guide your
implementation, here is an algorithm and some discussions on programming issues
for the operation “Insert”:
Paraphrase This Document
Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

The signature of the insert method is:
public int insert (RandomAccessFile file, Record rec)
It returns 0 for successful, -1 for block full and 1 for redundancy (more details
follow) The possible steps for insert:
(to describe the algorithm conveniently, assume the table is hashTbl [ ])
1) For a given record (input by user), use the id of the record as the original
key. Note that the key is an integer converted from the string id.
2) Apply the hash function (p.536) to the original key to obtain an index i,
which is used to locate a cell in the in-memory Hash Table hashTbl. The
table size n is decided by the number of blocks n which should be
obtained from the end user (do not multiply n by 2 in your hash function
because our strategy is more like separate chaining).
3) Retrieve the block index from hashTbl [i].
4) Calculate the start address (the offset) of the block in the external file:
block-index * blockSize (e.g. I * 300 in previous example). The blockSize
is recSize * r * 2. The r is the maximum number of records allowed in one
block provided by the user and 2 is for enlarging the capacity in each
block to avoid the full of block.
5) Use the address obtained in 4) to locate the position of the block in the file
(use the seek() method which can be found in the API of the
RandomAccessFile.java. The usage is also shown in the sample code
given by the instructor.)
6) Use the linear probe approach to insert the record in the block.
7) There are three possible return values:
0 – normal exit. The record has been inserted
1 – ID redundancy. During the probing, make sure that there is
no redundant key found (there is no any existing record with the
same key as the key of the record for inserting). If find
redundancy, do not do the insertion.
-1 – The block is full. If the number of probes reaches r and the
available space is not found in the block, the operation fails (the
block is full. This situation should be rear which indicates that
the hash function itself has a poor performance).
Start from step 5), the operations of the random access file will be involved.
The methods for read/write the file are provided by the instructor. See 5.
5. The classes provided by instructor:
1) Record class
This class defines a record required in the problem, includes methods to write
the record into a specified random access file and read a record from the file.
Class field:
o id – It is String type and limited to 4 chars. The range of id is 1 to 500.
(Giving 4 digits id is for extending purpose)
o name – It is String type and limited to 16 chars
public int insert (RandomAccessFile file, Record rec)
It returns 0 for successful, -1 for block full and 1 for redundancy (more details
follow) The possible steps for insert:
(to describe the algorithm conveniently, assume the table is hashTbl [ ])
1) For a given record (input by user), use the id of the record as the original
key. Note that the key is an integer converted from the string id.
2) Apply the hash function (p.536) to the original key to obtain an index i,
which is used to locate a cell in the in-memory Hash Table hashTbl. The
table size n is decided by the number of blocks n which should be
obtained from the end user (do not multiply n by 2 in your hash function
because our strategy is more like separate chaining).
3) Retrieve the block index from hashTbl [i].
4) Calculate the start address (the offset) of the block in the external file:
block-index * blockSize (e.g. I * 300 in previous example). The blockSize
is recSize * r * 2. The r is the maximum number of records allowed in one
block provided by the user and 2 is for enlarging the capacity in each
block to avoid the full of block.
5) Use the address obtained in 4) to locate the position of the block in the file
(use the seek() method which can be found in the API of the
RandomAccessFile.java. The usage is also shown in the sample code
given by the instructor.)
6) Use the linear probe approach to insert the record in the block.
7) There are three possible return values:
0 – normal exit. The record has been inserted
1 – ID redundancy. During the probing, make sure that there is
no redundant key found (there is no any existing record with the
same key as the key of the record for inserting). If find
redundancy, do not do the insertion.
-1 – The block is full. If the number of probes reaches r and the
available space is not found in the block, the operation fails (the
block is full. This situation should be rear which indicates that
the hash function itself has a poor performance).
Start from step 5), the operations of the random access file will be involved.
The methods for read/write the file are provided by the instructor. See 5.
5. The classes provided by instructor:
1) Record class
This class defines a record required in the problem, includes methods to write
the record into a specified random access file and read a record from the file.
Class field:
o id – It is String type and limited to 4 chars. The range of id is 1 to 500.
(Giving 4 digits id is for extending purpose)
o name – It is String type and limited to 16 chars

[Note: Each char in Unicode is two bytes. When you work with the
object of the RandomAccessFile in which the operations apply on
bytes, you should count the offset for each record by doubling the size
of the chars in a record. In another word, the record defined here
consists of 16 + 4 = 20 chars which are 40 bytes. This 40 will be used
by linear probing when you want to move to the next record.]
Constructor:
public record (String name, String id) –
create a new record
Public Methods:
public void read (RandomAccessFile file) –
to retrieve the name and id from the file at the current file pointer
position
public void write (RandomAccessFile file) –
to write name and id in the file at the current position
public String getId( ) –
to return the id of a record
public String getName( ) –
to return the name in a record
public String toString( ) – to display the name and id in the record
2) RandomFile class
This is a demo program. It shows how to create a random access file (which
may work as an external disk file in your project), how to use the write and
read operations defined in the Record class, how to locate a block by seek() in
the file, and how to close the file. It is an application class containing main().
You may run this class with Record class to see what happens to the created
external file. The file is created under the same directory with the classes you
run.
6. Write an application class named ExternalHashing to provide an interface for users
to use the system. In the operation menu, there are following items, Insert, Delete,
Search and Quit..
7. Your program should be general to accept any values for n (the size of the in-
memory hash table and also the number of blocks in the file) and r (number of
record allowed in each block and will be doubled internally) from the end user. You
may test your program by entering 5 for n and 7 for r (use primes).
8. The following classes should be included in your submission (in one zip file only):
Record.java – represents one record (from the instructor)
IndexHashTable.java – represents the hash table including the operations and other
help methods if there are any
ExternalHashing.java – the client class displaying the screen title and operations
menu to the user and do all user/system interactions
object of the RandomAccessFile in which the operations apply on
bytes, you should count the offset for each record by doubling the size
of the chars in a record. In another word, the record defined here
consists of 16 + 4 = 20 chars which are 40 bytes. This 40 will be used
by linear probing when you want to move to the next record.]
Constructor:
public record (String name, String id) –
create a new record
Public Methods:
public void read (RandomAccessFile file) –
to retrieve the name and id from the file at the current file pointer
position
public void write (RandomAccessFile file) –
to write name and id in the file at the current position
public String getId( ) –
to return the id of a record
public String getName( ) –
to return the name in a record
public String toString( ) – to display the name and id in the record
2) RandomFile class
This is a demo program. It shows how to create a random access file (which
may work as an external disk file in your project), how to use the write and
read operations defined in the Record class, how to locate a block by seek() in
the file, and how to close the file. It is an application class containing main().
You may run this class with Record class to see what happens to the created
external file. The file is created under the same directory with the classes you
run.
6. Write an application class named ExternalHashing to provide an interface for users
to use the system. In the operation menu, there are following items, Insert, Delete,
Search and Quit..
7. Your program should be general to accept any values for n (the size of the in-
memory hash table and also the number of blocks in the file) and r (number of
record allowed in each block and will be doubled internally) from the end user. You
may test your program by entering 5 for n and 7 for r (use primes).
8. The following classes should be included in your submission (in one zip file only):
Record.java – represents one record (from the instructor)
IndexHashTable.java – represents the hash table including the operations and other
help methods if there are any
ExternalHashing.java – the client class displaying the screen title and operations
menu to the user and do all user/system interactions
⊘ This is a preview!⊘
Do you want full access?
Subscribe today to unlock all pages.

Trusted by 1+ million students worldwide
1 out of 3
Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
Copyright © 2020–2025 A2Z Services. All Rights Reserved. Developed and managed by ZUCOL.
