CSCE 824 – Secure Database Systems

Spring 2019

 

Lecture Notes

Jan. 15.      Introduction and Relational databases overview

Review:  CSCE 520 lecture notes,  https://cse.sc.edu/~farkas/csce520-2015/csce520.htm

Jan. 17.     Overview of Relational Data Models and Distributed Databases (slides)

Reading:

1.       A. El Abbadi and S. Toueg. 1989. Maintaining availability in partitioned replicated databases. ACM Trans. Database Syst. 14, 2 (June 1989), 264-290. https://dl.acm.org/citation.cfm?id=63501

Interesting Reading:

2.       Uwe Röhm, Michael J. Cahill, Alan Fekete, Hyungsoo Jung, Seung Woo Baek, and Mathew Rodley. 2013. Robust snapshot replication. In Proceedings of the Twenty-Fourth Australasian Database Conference - Volume 137 (ADC '13), Hua Wang and Rui Zhang (Eds.), Vol. 137. Australian Computer Society, Inc., Darlinghurst, Australia, 81-91. https://dl.acm.org/citation.cfm?id=2525425

3.       Tudor-Ioan Salomie, Ionut Emanuel Subasu, Jana Giceva, and Gustavo Alonso. 2011. Database engines on multicores, why parallelize when you can distribute?. In Proceedings of the sixth conference on Computer systems (EuroSys '11). ACM, New York, NY, USA, 17-30. https://dl.acm.org/citation.cfm?id=1966448

 

Jan. 22.      Big Data Analytics, HADOOP Architecture

Reading:

1.       HDFS Architecture Guide, https://hadoop.apache.org/docs/r1.2.1/hdfs_design.pdf

2.       Hive – Introduction, https://www.tutorialspoint.com/hive/hive_introduction.htm

 

Jan. 24-29.       Security Primer (slides)

Reading: CSCE 522 lecture notes, https://cse.sc.edu/~farkas/csce522/csce522.htm

 

Jan. 31.      Access Control Models (slides)

Reading:

1.       S. De Capitani di Vimercati, P. Samarati, S. Jajodia: Policies, Models, and Languages for Access Control, in Databases in Networked Information Systems, Volume 3433 of the series Lecture Notes in Computer Science pp 225-237,  http://spdp.di.unimi.it/papers/2005-DNIS.pdf

 

Febr. 5.      Properties of Access Control Models (slides)

Reading:

1.       Charles Morisset and Nicola Zannone. 2014. Reduction of access control decisions. In Proceedings of the 19th ACM symposium on Access control models and technologies (SACMAT '14). ACM, New York, NY, USA, 53-62. https://dl.acm.org/citation.cfm?id=2613106

 

Febr. 7.      The Inference Problem: General Inference Problem (slides), web inferences (slides3), Statistical databases (slides3)

Reading:

1.       Davide Alberto Albertini, Barbara Carminati, and Elena Ferrari. 2017. An extended access control mechanism exploiting data dependencies. Int. J. Inf. Secur. 16, 1 (February 2017), 75-89. https://link.springer.com/article/10.1007/s10207-016-0322-4

2.       L. Sweeney. Weaving Technology and Policy Together to Maintain Confidentiality. Journal of Law, Medicine & Ethics, 25, nos. 2&3 (1997): 98-110. (http://onlinelibrary.wiley.com/doi/10.1111/j.1748-720X.1997.tb01885.x/epdf)

 

 

Febr. 12.    Big Data Access Control (slides)

Reading:

1.       L. Sweeney. Weaving Technology and Policy Together to Maintain Confidentiality. Journal of Law, Medicine & Ethics, 25, nos. 2&3 (1997): 98-110. (http://onlinelibrary.wiley.com/doi/10.1111/j.1748-720X.1997.tb01885.x/epdf)

2.       Amin Beheshti, Boualem Benatallah, Reza Nouri, Van Munin Chhieng, HuangTao Xiong, and Xu Zhao. 2017. CoreDB: a Data Lake Service. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM '17). ACM, New York, NY, USA, 2451-2454. https://dl.acm.org/citation.cfm?id=3133171

3.       Mina Farid, Alexandra Roatis, Ihab F. Ilyas, Hella-Franziska Hoffmann, and Xu Chu. 2016. CLAMS: Bringing Quality to Data Lakes. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, 2089-2092. https://dl.acm.org/citation.cfm?id=2899391

 

Febr. 14.    Data Provenance (slides)

Reading:

1.       Peter Buneman and Wang-Chiew Tan. 2007. Provenance in databases. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data (SIGMOD '07). ACM, New York, NY, USA, 1171-1173.

Interesting read (not required):

1.       Yael Amsterdamer, Susan B. Davidson, Daniel Deutch, Tova Milo, Julia Stoyanovich, and Val Tannen. 2011. Putting lipstick on pig: enabling database-style workflow provenance. Proc. VLDB Endow. 5, 4 (December 2011), 346-357. https://dl.acm.org/citation.cfm?id=2095693

2.       Eleanor Ainy, Pierre Bourhis, Susan B. Davidson, Daniel Deutch, and Tova Milo. 2015. Approximated Summarization of Data Provenance. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM '15). ACM, New York, NY, USA, 483-492. https://dl.acm.org/citation.cfm?id=2806429

 

 

Febr. 19 – 21.  No Classes → Work on Project1

 

Febr. 26.    Data Provenance (slides) cont.

Reading:

1.       Melanie Herschel, Ralf Diestelkämper, and Houssem Ben Lahmar. 2017. A survey on provenance: What for? What form? What from?. The VLDB Journal 26, 6 (December 2017), 881-906. https://dl.acm.org/citation.cfm?id=3159194

Interesting read (not required):

3.       Muhammad Naveed Aman, Kee Chaing Chua, and Biplab Sikdar. 2017. Secure Data Provenance for the Internet of Things. In Proceedings of the 3rd ACM International Workshop on IoT Privacy, Trust, and Security (IoTPTS '17). ACM, New York, NY, USA, 11-14. https://dl.acm.org/citation.cfm?id=3055255

4.       Matteo Interlandi, Ari Ekmekji, Kshitij Shah, Muhammad Ali Gulzar, Sai Deep Tetali, Miryung Kim, Todd Millstein, and Tyson Condie. 2018. Adding data provenance support to Apache Spark. The VLDB Journal 27, 5 (October 2018), 595-615. https://dl.acm.org/citation.cfm?id=3283005

 

Febr. 28.    XML and XML Security (slides)

Reading:

1.       XML Primer, W3C, http://www.w3c.it/education/2012/upra/documents/xmlprimer.pdf

2.       Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, and Pierangela Samarati. 2002. A fine-grained access control system for XML documents. ACM Trans. Inf. Syst. Secur. 5, 2 (May 2002), 169-202. https://dl.acm.org/citation.cfm?id=505590

 

March 5.   XML Database (slides)

Reading:

1.       XML Primer, W3C, http://www.w3c.it/education/2012/upra/documents/xmlprimer.pdf

2.       Ernesto Damiani, Sabrina De Capitani di Vimercati, Stefano Paraboschi, and Pierangela Samarati. 2002. A fine-grained access control system for XML documents. ACM Trans. Inf. Syst. Secur. 5, 2 (May 2002), 169-202. https://dl.acm.org/citation.cfm?id=505590

 

March 7.   XML normalization (slides)

Reading:

1.       Cong Yu and H. V. Jagadish. 2008. XML schema refinement through redundancy detection and normalization. The VLDB Journal 17, 2 (March 2008), 203-223. https://dl.acm.org/citation.cfm?id=1342417

2.       Millist W. Vincent, Jixue Liu, and Chengfei Liu. 2004. Strong functional dependencies and their application to normal forms in XML. ACM Trans. Database Syst. 29, 3 (September 2004), 445-462. https://dl.acm.org/citation.cfm?id=1016029

3.       Serge Abiteboul, Georg Gottlob, and Marco Manna. 2009. Distributed XML design. In Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '09). ACM, New York, NY, USA, 247-258. https://dl.acm.org/citation.cfm?id=1559833

 

March 12-14:  Spring Break

 

March 19. XML Inferences (slides)

Reading:

1.       Andrei Stoica and Csilla Farkas. 2004. Ontology guided XML security engine. J. Intell. Inf. Syst. 23, 3 (November 2004), 209-223. https://cse.sc.edu/~farkas/papers/journal13.pdf

 

March 21-26. Streaming Data (slides) – Theppatorn Rhujittawiwat [1]

Reading:

1.       Daniel J. Abadi, Don Carney, Ugur Çetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, and Stan Zdonik. 2003. Aurora: a new model and architecture for data stream management. The VLDB Journal 12, 2 (August 2003), 120-139. https://dl.acm.org/citation.cfm?id=950485

2.       Barbara Carminati, Elena Ferrari, Jianneng Cao, and Kian Lee Tan. 2010. A framework to enforce access control over data streams. ACM Trans. Inf. Syst. Secur. 13, 3, Article 28 (July 2010), 31 pages. https://dl.acm.org/citation.cfm?id=1805984

 

March 28.       Cloud Databases (slides) 

Microsoft Azure (slides) – Josh Gregory [2]

Reading:

1.       Jun Tang, Yong Cui, Qi Li, Kui Ren, Jiangchuan Liu, and Rajkumar Buyya. 2016. Ensuring Security and Privacy Preservation for Cloud Data Services. ACM Comput. Surv. 49, 1, Article 13 (June 2016), 39 pages. https://dl.acm.org/citation.cfm?id=2906153

2.       Microsoft Azure, https://azure.microsoft.com/en-us/get-started/

 

April 2.            Data Analytics (slides)

Context Matters: How software vulnerabilities impact data security (slides) – Kimberly Redmond [2]

Reading:

1.       Latifur Khan. 2018. Big IoT Data Stream Analytics with Issues in Privacy and Security. In Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics (IWSPA '18). ACM, New York, NY, USA, 22-22. https://dl.acm.org/citation.cfm?id=3180455

2.       Kang, Boojoong, et al. "Malware classification method via binary content comparison." Proceedings of the 2012 ACM Research in Applied Computation Symposium. ACM, 2012. https://dl-acm-org.pallas2.tcl.sc.edu/citation.cfm?id=2401672

                        Interesting:

3.       Zuo, Fei, et al. "Neural machine translation inspired binary code similarity comparison beyond function pairs." arXiv preprint arXiv:1808.04706 (2018). https://arxiv.org/abs/1808.04706

4.        Seyed Mohammad Ghaffarian and Hamid Reza Shahriari. 2017. Software Vulnerability Analysis and Discovery Using Machine-Learning and Data-Mining Techniques: A Survey. ACM Comput. Surv. 50, 4, Article 56 (August 2017), 36 pages. https://dl.acm.org/citation.cfm?id=3092566   

 

April 4.            Privacy Preserving Cloud Computing (slides) – Fengyao Yan [1]

                        Privacy and DM Applications (slides) – Saivenkatanikhil Nimmagadda [2]

                        Big Data Credibility (slides) – Salhulding Alquarghuli [3]

Reading:

1.       Xun Yi, Fang-Yu Rao, Elisa Bertino, and Athman Bouguettaya. 2015. Privacy-Preserving Association Rule Mining in Cloud Computing. In Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security (ASIA CCS '15). ACM, New York, NY, USA, 439-450. https://dl.acm.org/citation.cfm?id=2714603

2.       Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, and Muthuramakrishnan Venkitasubramaniam. 2007. L-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 1, 1, Article 3 (March 2007). https://dl.acm.org/citation.cfm?id=1217302

3.       Shi Zhi, Yicheng Sun, Jiayi Liu, Chao Zhang, and Jiawei Han. 2017. ClaimVerif: A Real-time Claim Verification System Using the Web and Fact Databases. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM '17). ACM, New York, NY, USA, 2555-2558. https://dl.acm.org/citation.cfm?id=3133182

 

April 9.            Database Intrusion Detection (slides) – Matthew Heightland [1]

                        Use of Provenance for Intrusion Detection (slides) – Rohit Naini [2]

                        Data Analytics for Attack Detection (slides) – Xiya Xia [3]

Reading:

1.       Mohammad Saiful Islam, Mehmet Kuzu, and Murat Kantarcioglu. 2015. A Dynamic Approach to Detect Anomalous Queries on Relational Databases. In Proceedings of the 5th ACM Conference on Data and Application Security and Privacy (CODASPY '15). ACM, New York, NY, USA, 245-252. https://dl.acm.org/citation.cfm?id=2699120

2.       Ragib Hasan, Radu Sion, and Marianne Winslett. 2009. Preventing history forgery with secure provenance. Trans. Storage 5, 4, Article 12 (December 2009), 43 pages. https://dl.acm.org/citation.cfm?id=1629082

3.       Peng Gao, Xusheng Xiao, Zhichun Li, Kangkook Jee, Fengyuan Xu, Sanjeev R Kulkarni, and Prateek Mittal. Aiql: enabling efficient attack investigation from system monitoring data. In Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference, pages 113–125. USENIX Association, 2018. https://www.usenix.org/system/files/conference/atc18/atc18-gao.pdf

 

April 11.          Data security needs in high performance computing (HPC) (slides) [1]

                        Data Provenance (slides) – Denise Davis [2]

                        Identity Management (slides) – Marc Bowman [3]

Reading:

1.       Melanie Herschel, Ralf Diestelkämper, and Houssem Ben Lahmar. 2017. A survey on provenance: What for? What form? What from?. The VLDB Journal 26, 6 (December 2017), 881-906. https://dl.acm.org/citation.cfm?id=3159194 

2.       Matteo Interlandi, Ari Ekmekji, Kshitij Shah, Muhammad Ali Gulzar, Sai Deep Tetali, Miryung Kim, Todd Millstein, and Tyson Condie. 2018. Adding data provenance support to Apache Spark. The VLDB Journal 27, 5 (October 2018), 595-615. https://dl.acm.org/citation.cfm?id=3283005

3.       Susmita Horrow and Anjali Sardana. 2012. Identity management framework for cloud based internet of things. In Proceedings of the First International Conference on Security of Internet of Things (SecurIT '12). ACM, New York, NY, USA, 200-203. https://dl.acm.org/citation.cfm?id=2490456

 

April 16.          Cloud applications (slides) – Harrison Howell [1] 

Privacy and Machine Learning (slides) – Nick Rhodes [2]

Identity Management (slides) – Marc Bowman [3]

Reading:

1.       Vlado Stankovski, Salman Taherizadeh, Ian Taylor, Andrew Jones, Bruce Becker, Carlo Mastroianni, and Heru Suhartanto. 2015. Towards an environment supporting resilience, high-availability, reproducibility and reliability for cloud applications. In Proceedings of the 8th International Conference on Utility and Cloud Computing (UCC '15). IEEE Press, Piscataway, NJ, USA, 383-386. DOI: https://doi.org/10.1109/UCC.2015.61

2.       N. Papernot, P. McDaniel, A. Sinha and M. P. Wellman, "SoK: Security and Privacy in Machine Learning," 2018 IEEE European Symposium on Security and Privacy (EuroS&P), London, 2018, pp. 399-414. http://www-personal.umich.edu/~arunesh/Files/Other/Papers/18-eurosp-adv-ml-sok.pdf

3.       Susmita Horrow and Anjali Sardana. 2012. Identity management framework for cloud based internet of things. In Proceedings of the First International Conference on Security of Internet of Things (SecurIT '12). ACM, New York, NY, USA, 200-203. https://dl.acm.org/citation.cfm?id=2490456

 

April 18.          Internet of Things and Access Control (slides) – Andrew Cox [1]

                        Threats to Privacy in the Forensic Analysis of Database Systems (slides) – Andrew Michels [2]

                        Data Provenance (slides) – Denise Davis [3]

Reading:

1.       Bertin, Emmanuel, et al. "Access Control in the Internet of Things: a Survey of Existing Approaches and Open Research Questions." Annals of Telecommunications, 2019. https://link.springer.com/article/10.1007/s12243-019-00709-7

2.       Stahlberg, Patrick, et al. “Threats to Privacy in the Forensic Analysis of Database Systems.” Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data -SIGMOD '07, 2007. https://dl.acm.org/citation.cfm?id=1247492

3.       Matteo Interlandi, Ari Ekmekji, Kshitij Shah, Muhammad Ali Gulzar, Sai Deep Tetali, Miryung Kim, Todd Millstein, and Tyson Condie. 2018. Adding data provenance support to Apache Spark. The VLDB Journal 27, 5 (October 2018), 595-615. https://dl.acm.org/citation.cfm?id=3283005

 

 

April 23.          Anomalous Database Transaction Detection (slides) – Harshith Reddy Sarabudla [1]

                        Medical data privacy (slides1, slides2) – Jiexi Wang [2]

Reading:

1.       Syed Rafiul Hussain, Asmaa M. Sallam, Elisa Bertino, “DetAnom: Detecting Anomalous Database Transactions by Insiders”. 5th ACM Conference on Data and Application Security and Privacy 2015. https://dl.acm.org/citation.cfm?id=2699111

2.       M. Marwan, A. Kartit, and H. Ouahmane. 2017. Design a Secure Framework for Cloud-Based Medical Image Storage. In Proceedings of the 2nd international Conference on Big Data, Cloud and Applications (BDCA'17). ACM, New York, NY, USA, Article 7, 6 pages. https://dl.acm.org/citation.cfm?id=3090361

April 25.          Review Lecture and Dark Data and Cybersecurity (slides)

Reading:

1.       Ce Zhang, Jaeho Shin, Christopher Ré, Michael Cafarella, and Feng Niu. 2016. Extracting Databases from Dark Data with DeepDive. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, 847-859. https://dl.acm.org/citation.cfm?id=2904442

 

May 7. 4:00 pm – 6:30 pm

Final Project Presentations

1.       Salhuldin

2.       Bowman

3.       Cox

4.       Gregory

5.       Heightland

6.       Howell

7.       Michels

8.       Naini

9.       Nimmagadda

10.   Redmond

11.   Theppatorn

12.   Sarabudla

13.   Wang

14.   Xia

15.   Yan

 

 

Logistics:

1.       Upload your FINAL Report by May 5, 11:55 pm.   Note: all reports will be posted in Dropbox for the rest of the class.

2.       Upload your 4-minute presentation to Dropbox by May 6, 11:55 pm.

 

On May 7, 2019

1.       I will post the presentations on the class website before class

2.       Each student will have exactly 4 minutes to present their work (4:00 pm – 5:10 pm)

3.       We will use open voting for each project to rank them (5:10 – 5:30 pm) – you can promote your project

4.       Discussion on projects and identifying future possibilities (5:30 – 6:00 pm)

5.       Revisit ranking to select final top 3 (6:00 – 6:30 pm)