Wednesday, February 7, 2018

Papers published and brief overview

Paper-1: Experimental Evaluation of Network Telemetry Anonymization for Cloud Based Security Analysis


-Format Preserving Encryption = Length Preserving Encryption + Ranking functions
-Searchable Encryption: clients can search using a token called a Trapdoor
Symmetric searchable encryption
-Data Anonymization and Analytics as a Service (DAAS): anonymizes network telemetry data while ensuring the ability to perform analytics on the anonymized data
-Network Scanning Detection -
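The trapdoor-based search mentioned above can be sketched with an HMAC-based toy index. This is an illustration of the trapdoor idea, not the SWP scheme itself; all names and keywords are made up:

```python
import hashlib
import hmac

def tag(key: bytes, keyword: str) -> bytes:
    """Deterministic keyword tag; stands in for an encrypted index entry."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()

def build_index(key, record_keywords):
    # server-side encrypted index: record id -> set of keyword tags
    return {rid: {tag(key, w) for w in words} for rid, words in record_keywords.items()}

def trapdoor(key, keyword):
    # client computes the trapdoor for the keyword it wants to search
    return tag(key, keyword)

def search(index, td):
    # server matches the trapdoor against stored tags without learning the keyword
    return sorted(rid for rid, tags in index.items() if td in tags)

key = b"client-secret"
index = build_index(key, {1: {"10.0.0.5", "tcp"}, 2: {"10.0.0.9", "udp"}})
print(search(index, trapdoor(key, "tcp")))  # -> [1]
```

The server only ever sees opaque tags and trapdoors; the deterministic tag is also what makes frequency leakage possible, which is the issue the onion-layered construction below addresses.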

1) Direct: Data fields that are considered sensitive and directly needed for analytics, for example internal IP addresses, MAC addresses etc. Anonymization of these fields should facilitate correlation across multiple homogeneous sources (different telemetry collectors, devices) and/or multiple heterogeneous sources (different logs like Web Server Logs, NetFlow records etc.)
-> Searchable Encryption: Data fields that are sensitive and identified as directly needed for analytics are encrypted using Searchable Encryption techniques.
-> Deterministic encryption leaks frequency information. To mitigate this we leverage the probabilistic nature of Searchable Encryption along with Format Preserving Encryption schemes in an onion-layered fashion similar to CryptDB.
2) Indirect: Data fields that are sensitive and not directly needed for analytics but need to be part of final results or reports. For example, some use cases may not need the TIMESTAMP of a flow for analytics, but it may be needed for further forensics in the consumer network.
-> Data fields identified as indirectly needed are encrypted using randomized techniques (like different IVs or tweaks etc.)
3) Public: Data fields that are non-sensitive in nature are left unchanged. A few data fields may need to be sanitized before sharing. For example, DST IP 8.8.8.8 can be left as is since the server is public.
4) Removal (or Nullifying): Data fields not needed for analytics should be nullified or removed completely from the data records, for example the protocol version. Failing to do so may leak unwanted information that adversaries can use for inference attacks.
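The four field classes can be sketched as a per-field policy table. The transforms here are stdlib placeholders standing in for the real SSE/FPE and randomized schemes, and all field names are hypothetical:

```python
import hashlib
import os

# Placeholder transforms (NOT real crypto):
def sse_like(v):   # Direct: deterministic, so values stay correlatable/searchable
    return hashlib.sha256(b"k1" + v.encode()).hexdigest()[:12]

def rnd_like(v):   # Indirect: randomized, so repeated values are unlinkable
    return hashlib.sha256(os.urandom(16) + v.encode()).hexdigest()[:12]

POLICY = {
    "src_ip":    ("direct",   sse_like),
    "timestamp": ("indirect", rnd_like),
    "dst_ip":    ("public",   lambda v: v),
    "proto_ver": ("remove",   None),
}

def anonymize(record):
    out = {}
    for field, value in record.items():
        cls, fn = POLICY.get(field, ("remove", None))  # default: drop unknown fields
        if cls != "remove":
            out[field] = fn(value)
    return out

rec = {"src_ip": "10.0.0.5", "timestamp": "1517990400",
       "dst_ip": "8.8.8.8", "proto_ver": "v9"}
print(anonymize(rec))
```

Running `anonymize` twice on the same record yields the same `src_ip` token (enabling cross-source correlation) but different `timestamp` tokens, matching the Direct/Indirect distinction above.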

Onion Layered Encryption -> 
Format Preserving and Searchable : Ranking, SSE, LP-DET
SSE - Searchable: SWP scheme
FP DET: Ranking, LP-DET
DET: AES in ECB mode
RND: AES in CBC mode

Usage of Metrics:
1) NetFlow Anonymization - Used in Security Information and Event Management (SIEM) or Intrusion Prevention Systems(IPS)

Test Data:
We further split these data sets into chunks of approximately 10,000 records each, carefully ensuring that each such chunk contains traces of scanning activity.
-Total Flows is the number of flows (inbound and outbound) in the network
-Internal IP Addresses is the number of IP addresses in the internal network. These are the ones targeted during reconnaissance in the Scan Detection use case and/or of interest to forensic investigators in the Metrics use case.
-Matching Flows is the number of outbound flows that match the Internal IP Addresses.
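A minimal sketch of the Matching Flows metric: count outbound flows whose source address belongs to the internal network. The record fields and addresses are assumptions for illustration:

```python
# Hypothetical internal address set and flow records
INTERNAL = {"10.0.0.5", "10.0.0.9"}

def matching_flows(flows):
    # outbound flows originating from an internal IP address
    return [f for f in flows if f["direction"] == "out" and f["src"] in INTERNAL]

flows = [
    {"src": "10.0.0.5", "dst": "8.8.8.8",  "direction": "out"},
    {"src": "8.8.8.8",  "dst": "10.0.0.5", "direction": "in"},
    {"src": "10.0.0.9", "dst": "1.1.1.1",  "direction": "out"},
]
print(len(matching_flows(flows)))  # -> 2
```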


Paper-2:  On the Optimization of Key Revocation Schemes for Network Telemetry Data Distribution


Attribute-based encryption is a type of public-key encryption in which the secret key of a user and the ciphertext are dependent upon attributes (e.g. the country in which they live, or the kind of subscription they have). In such a system, the decryption of a ciphertext is possible only if the set of attributes of the user key matches the attributes of the ciphertext.

- Attribute-based encryption (ABE) can be used for log encryption. Instead of encrypting each part of a log with the keys of all recipients, it is possible to encrypt the log only with attributes which match recipients' attributes. This primitive can also be used for broadcast encryption in order to decrease the number of keys used.


Challenges:
Although the ABE concept is very powerful and a promising mechanism, ABE systems suffer mainly from two drawbacks: inefficiency and the lack of an attribute revocation mechanism.

Other main challenges are:
Key coordination
Key escrow
Key revocation

ID-based encryption:
ID-based encryption, or identity-based encryption (IBE), is an important primitive of ID-based cryptography. As such it is a type of public-key encryption in which the public key of a user is some unique information about the identity of the user (e.g. a user's email address). This means that a sender who has access to the public parameters of the system can encrypt a message using e.g. the text value of the receiver's name or email address as a key. The receiver obtains its decryption key from a central authority, which needs to be trusted as it generates secret keys for every user.


Elliptic-curve cryptography (ECC) is an approach to public-key cryptography based on the algebraic structure of elliptic curves over finite fields. ECC requires smaller keys compared to non-ECC cryptography (based on plain Galois fields) to provide equivalent security.

For elliptic-curve-based protocols, it is assumed that finding the discrete logarithm of a random elliptic curve element with respect to a publicly known base point is infeasible: this is the "elliptic curve discrete logarithm problem" (ECDLP). The security of elliptic curve cryptography depends on the ability to compute a point multiplication and the inability to compute the multiplicand given the original and product points. The size of the elliptic curve determines the difficulty of the problem. 256-bit ECC key has security similar to 3072-bit RSA key

ABE

Attribute-based encryption (ABE) is a relatively recent approach that reconsiders the concept of public-key cryptography. In traditional public-key cryptography, a message is encrypted for a specific receiver using the receiver's public key. Identity-based cryptography, and in particular identity-based encryption (IBE), changed the traditional understanding of public-key cryptography by allowing the public key to be an arbitrary string, e.g., the email address of the receiver. ABE goes one step further and defines the identity not as atomic but as a set of attributes, e.g., roles, and messages can be encrypted with respect to subsets of attributes (key-policy ABE - KP-ABE) or policies defined over a set of attributes (ciphertext-policy ABE - CP-ABE). The key issue is that someone should only be able to decrypt a ciphertext if they hold a key for "matching attributes" (more below), where user keys are always issued by some trusted party.

Ciphertext-Policy ABE

In ciphertext-policy attribute-based encryption (CP-ABE) a user's private key is associated with a set of attributes and a ciphertext specifies an access policy over a defined universe of attributes within the system. A user will be able to decrypt a ciphertext if and only if his attributes satisfy the policy of the respective ciphertext. Policies may be defined over attributes using conjunctions, disjunctions and (k, n)-threshold gates, i.e., k out of n attributes have to be present (there may also be non-monotone access policies with additional negations, and meanwhile there are also constructions for policies defined as arbitrary circuits). For instance, let us assume that the universe of attributes is defined to be {A, B, C, D}, user 1 receives a key to attributes {A, B} and user 2 to attribute {D}. If a ciphertext is encrypted with respect to the policy (A ∧ C) ∨ D, then user 2 will be able to decrypt, while user 1 will not be able to decrypt.

CP-ABE thus makes it possible to realize implicit authorization, i.e., authorization is included in the encrypted data and only people who satisfy the associated policy can decrypt the data. Another nice feature is that users can obtain their private keys after data has been encrypted with respect to policies. So data can be encrypted without knowledge of the actual set of users that will be able to decrypt, only specifying the policy which allows decryption. Any future users that are given a key with respect to attributes such that the policy can be satisfied will then be able to decrypt the data.
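The access-structure check from the example above can be sketched directly. This is only the policy evaluation, not the ABE cryptography, for the policy (A ∧ C) ∨ D:

```python
# Toy evaluation of the example policy (A AND C) OR D
def satisfies(attrs: set) -> bool:
    return ({"A", "C"} <= attrs) or ("D" in attrs)

user1 = {"A", "B"}   # key attributes of user 1: cannot decrypt
user2 = {"D"}        # key attributes of user 2: can decrypt
print(satisfies(user1), satisfies(user2))  # -> False True
```

In a real CP-ABE scheme this check is enforced cryptographically by the secret-sharing embedded in the ciphertext, not by a boolean test the server could skip.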

Key-Policy ABE

KP-ABE is the dual to CP-ABE in the sense that an access policy is encoded into the user's secret key, e.g., (A ∧ C) ∨ D, and a ciphertext is computed with respect to a set of attributes, e.g., {A, B}. In this example the user would not be able to decrypt the ciphertext but would for instance be able to decrypt a ciphertext with respect to {A, C}.

An important property which has to be achieved by both CP-ABE and KP-ABE is called collusion resistance. This basically means that it should not be possible for distinct users to "pool" their secret keys such that they could together decrypt a ciphertext that neither of them could decrypt on their own (which is achieved by independently randomizing users' secret keys).

Beyond ABE

ABE is just one type of the more general concept of functional encryption (FE) covering IBE, ABE and many other concepts such as inner product or hidden vector encryption (yielding e.g., searchable encryption) etc. It is a very active and young field of research and has many interesting applications (in particular in the field of cloud computing).
Source: https://crypto.stackexchange.com/questions/17893/what-is-attribute-based-encryption



Broadcast Encryption
The BE scheme uses a bilinear group G of order p and takes identities in the range 1, ..., n, where n is the number of users and r is the number of revoked users. The scheme provides a strong revocation mechanism, requiring the ciphertext to have O(r) elements and public and private keys of constant size. The scheme comprises Setup(n), KeyGen(MSK, ID), Encrypt(PK, M, S) and Decrypt(S, CT, ID, D); if ID is in the revocation list the algorithm aborts.

Optimizations in the scheme:
Right pairings and curves, e.g. SS512
1) Randomness reuse: ZR, G1, G2 - During the encrypt operation, a random ZR element is used in the s1 variable, and for each item in the revocation list a random ZR element is assigned.
2) Bilinear pairing computation: During the decrypt operation, the bilinear pairing is computed for pair(c[i], d[i]) where i is in the range [1..7]. The pairing results are cached and reused while decrypting data for the same user, so the pairing computation is done once per user.
3) Indexing: The group.random values for G1, G2 and ZR are precomputed and stored in a buffer; the group.hash value is computed for each user in the revoked list during encrypt and decrypt execution, as hash = group.hash(S[i].upper()) for each entry in revoked list S. An index of the revoked user list and the corresponding hash values is maintained, since it is used frequently during encryption and decryption operations.
4) Reuse of intermediate results: During decrypt, the A4 computation takes a considerable amount of time; calculating it requires the ciphertext, the user secret key and the hash values of users in the revoked list. In circumstances where data is encrypted to a large set of users, this value is computed on a periodic basis.
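Optimizations 2 and 3 are both caching patterns. A sketch using stdlib memoization, where a hash stands in for the expensive per-user pairing product (the real computation would use a pairing library):

```python
import functools
import hashlib

@functools.lru_cache(maxsize=None)
def pairing_for_user(user_id: str) -> bytes:
    # placeholder for the expensive pair(c[i], d[i]) products; with lru_cache
    # it is computed once per user and served from cache afterwards
    return hashlib.sha256(b"pairing|" + user_id.encode()).digest()

revoked = ["alice", "bob"]
# precomputed index: revoked user -> group.hash-style value, built once and reused
revoked_index = {u: hashlib.sha256(u.upper().encode()).digest() for u in revoked}

h1 = pairing_for_user("carol")
h2 = pairing_for_user("carol")              # cache hit, no recomputation
print(h1 == h2, "alice" in revoked_index)   # -> True True
```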

SSL Handshake:
==========
1. Client initiates with a Client Hello message, the ciphers supported and random number #1
2. Server Hello: cipher selected, random number #2, SSL certificate
3. Client creates the Pre-master secret
Pre-master secret is encrypted with the Server's public key
4. Both client and server now have random numbers #1 and #2 and the pre-master secret, and generate the Master secret
5. Client sends: SSL Finished
6. Server sends: SSL Finished
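Step 4 can be illustrated with a toy key derivation: both sides feed the same nonces and pre-master secret into the same function and obtain the same master secret. This single HMAC is NOT the real TLS PRF, only a sketch of the principle:

```python
import hashlib
import hmac
import os

def derive_master(pre_master: bytes, r1: bytes, r2: bytes) -> bytes:
    # toy stand-in for the TLS key derivation: same inputs -> same output
    return hmac.new(pre_master, b"master secret" + r1 + r2, hashlib.sha256).digest()

r1, r2 = os.urandom(32), os.urandom(32)   # random number #1 (client), #2 (server)
pre_master = os.urandom(48)               # created by the client in step 3

client_master = derive_master(pre_master, r1, r2)
server_master = derive_master(pre_master, r1, r2)
print(client_master == server_master)  # -> True
```

An eavesdropper sees r1 and r2 in the clear but not the pre-master secret (encrypted under the server's public key), so only the two endpoints can derive the master secret.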

Paper: Experiments in Encrypted and Searchable Network Audit Logs
Advances in Identity-Based Encryption and Attribute-Based Encryption techniques look promising in providing capabilities of privacy-preserving search over encrypted data.

Privacy Preserving Audit:
Searchable Encryption techniques allow a client to store encrypted records at a remote server. Subsequently the client can search the server using a token called a Trapdoor, which the server uses to match the encrypted records, returning the matching ones.

Setup: The algorithm initialization depends on the bilinear pairing and elliptic curve used. The key server then generates the master key MK and public key PK.

-> Selection of bilinear pairing: a bilinear group G0 of prime order p with generator g. We have used elliptic curves with bilinear maps (or pairings) like SS512, a symmetric curve with a 512-bit base field using which the private key is generated, and asymmetric curve pairings like MNT159 and MNT224, having 159-bit and 224-bit base fields respectively.
-> Curve selection: We used a Type-A curve such as y^2 = x^3 + x to compute the pairings

Key generation and sharing: The user secret key is generated using (PK, MK, search keyword). As in the traditional CP-ABE scheme, attributes are associated with the user key and the access policy is associated with the ciphertext; here, in place of public identifiers, the search keyword is used. The secret key is communicated to interested parties using a secure channel like TLS/SSL.

Encryption: The data records are read from the SiLK repository. For each log entry m comprising search keywords w1, w2, ..., wn (keywords could be an IP address, subnet mask or protocol by which the user would like to filter the data):

-> The server encrypts the entry using a random symmetric encryption key K, to get E_K(m). For each keyword wi, the server computes the CP-ABE encryption ci of the string (FLAG|K) using the search keyword as the access policy and the public key PK
-> The choice of symmetric encryption for data encryption is attributed to the fact that it exhibits high performance and is more suitable for encrypting large data. We have used AES in CBC mode with a 16-byte block size (with PKCS5 padding) and the HMAC-SHA1 algorithm as a PRP generator.

Match and Decrypt:
If the data owner wants to provide controlled access to a third-party auditor who wishes to search and retrieve particular data from the encrypted records, the data owner, with the help of the key authority, constructs a private key with that capability; then for each encrypted record the MatchAndDecrypt operation is run:
• As part of the match routine, the data record is decrypted using (PK, sk, ciphertext) and the decrypted text is checked for the FLAG prefix.
• If the match returns true, the ciphertext c is decrypted using the previously generated secret key sk and public key PK. The symmetric encryption key is extracted from the decrypted text and one more round of decryption happens, this time using the symmetric key.
• If the match is false, the record is not processed further.
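The FLAG-prefix trick can be sketched with a toy cipher. A keyword-derived keystream stands in for the CP-ABE layer: decrypting with the right keyword reveals the FLAG prefix, signalling that the recovered key K is valid. Keywords and key material are fabricated:

```python
import hashlib

FLAG = b"FLAG"

def stream(keyword: str, n: int) -> bytes:
    # toy keystream derived from the keyword (stands in for the CP-ABE layer)
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(keyword.encode() + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_keyword_slot(keyword: str, K: bytes) -> bytes:
    pt = FLAG + K                       # the (FLAG|K) string from the paper
    return xor(pt, stream(keyword, len(pt)))

def match_and_decrypt(ct: bytes, keyword: str):
    pt = xor(ct, stream(keyword, len(ct)))
    if pt.startswith(FLAG):             # match routine: check the FLAG prefix
        return pt[len(FLAG):]           # extract K for the symmetric round
    return None                         # no match: record not processed further

K = b"0123456789abcdef"
ct = encrypt_keyword_slot("10.0.0.5", K)
print(match_and_decrypt(ct, "10.0.0.5") == K, match_and_decrypt(ct, "udp"))
```

With the wrong trapdoor the decryption yields pseudorandom bytes, so the FLAG check fails and the record is skipped without revealing K.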

Reverse Proxy Vs Forward Proxy:
Forward Proxy: Acting on behalf of a requestor (or service consumer)
Reverse Proxy: Acting on behalf of service/content producer.

Reverse Proxy Gateway Functionality
Security: The Reverse Proxy Gateway acts as the entry point and, in the case of secure Hypertext Transfer Protocol (HTTPS) requests, it decrypts the request contents and passes the decoded request content to backend servers which can handle only HTTP requests. One advantage offered on the security front is offloading the Secure Socket Layer (SSL/TLS) processing, which is CPU intensive.
Centralized Logging Service: As incoming requests are routed through the reverse proxy gateway, it captures important events pertaining to traffic patterns and helps the security monitoring service.
Load balancing: The reverse proxy load-balances the incoming traffic to available servers based on availability, and distributes requests using strategies like sticky sessions in case of stateful sessions or round-robin selection of servers.
Caching and static content serving: This service is required for content-heavy applications like YouTube, where server responsiveness is improved by hosting static content on the gateway server to improve access speed.
Gzip compression and decompression: In order to speed up data transfer between client and server, the reverse proxy servers can compress data before a request is served and decompress the data uploaded by clients.

http://www.jscape.com/blog/bid/87783/Forward-Proxy-vs-Reverse-Proxy

Challenges associated with Traditional Data Sharing Approaches
In many cases the cloud service provider needs to share sensitive data with cloud users. This kind of data sharing can happen periodically at stipulated intervals or on demand whenever a security investigation is necessary. With traditional security schemes such as PKI, there are certain challenges which tend to become a bottleneck when used in a cloud environment.
1. Certificate Management: The existing mechanisms for secure content sharing largely rely on secure socket layers, which use certificates for trust establishment. Certificate management involves validation of certificates and frequent synchronization with certificate authority servers.
2. Validity of the certificate: The CA publishes certificate status information which client applications can validate; this information is published periodically and clients synchronize it with the server. The receiver needs to verify that the sender's certificate has not been revoked. The certificate status information is queried using a Certificate Revocation List (CRL) or the Online Certificate Status Protocol (OCSP).
3. How to trust the Certificate Authority: The sender and receiver may have different CAs. This leads to CA chain certification or validation even for CAs.
4. Fine-grained Access Control: The existing security mechanisms provide either complete access to data or completely restrict usage; they do not provide an easy way to share data selectively with other parties.

Paper-3: Improved Data Confidentiality of Audit Trail Data in Multi-Tenant Cloud

Typically cloud providers have a demilitarized zone protecting the data center along with a reverse proxy setup. The reverse proxy gateway acts as the initial access point and provides additional capabilities like load balancing, caching and security monitoring, capturing events and syslogs related to hosts residing in the cloud. The audit-trail logs captured by the reverse proxy server comprise important information related to all the tenants.

We provide a two-phase approach for sharing the audit logs with users, allowing fine-grained access. In this paper we evaluate certain Identity-Based and Attribute-Based Encryption schemes and provide a detailed analysis of performance.

The internet-facing reverse proxy gateway provides protection against issues like intrusions, denial of service attacks etc. Data collected by the reverse proxy includes system logs and alarms, and it can capture HTTP/REST requests and remote-service calls pertaining to tenants if it is configured as the SSL termination end-point.

Audit trail log structure:
http {
  log_format compression '$remote_addr - $remote_user [$time_local] '
                         '"$request" $status $body_bytes_sent '
                         '"$http_referer" "$http_user_agent"';
}
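A sketch of parsing one audit-trail line in the format above. The regex follows the log_format fields; the sample line itself is fabricated for illustration:

```python
import re

# Groups mirror the nginx log_format variables above
PATTERN = re.compile(
    r'(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

line = ('203.0.113.7 - alice [07/Feb/2018:10:15:32 +0000] '
        '"GET /api/v1/logs HTTP/1.1" 200 512 "-" "curl/7.58.0"')

m = PATTERN.match(line)
print(m.group("addr"), m.group("status"))  # -> 203.0.113.7 200
```

Fields like `addr`, `user` and `time` are exactly the per-tenant identifiers that the Type-I/Type-II schemes later protect.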

METHODOLOGY
Consider a public cloud provider hosting tenants, with a reverse proxy server installed which captures audit-trail logs of incoming traffic pertaining to clients. We consider the role of the reverse proxy server extended as an SSL termination end-point so that it can intercept all HTTP/SSL traffic. The cloud provider has a Network Admin who has access to the entire logs, and cloud tenants have users with different roles like level-1, level-2, level-3 etc., where level-1 users are at the bottom of the organizational hierarchy and are monitored by level-2, and so on.
A. Privacy and Security of Audit Logs - Objectives: We divide the problem into two sub-domains -
1. The Cloud Network Admin has access control on the entire logs and can do operations like search, encryption and decryption.
2. Tenant users like the tenant Network Admin can access all tenant-specific logs, while users at Level-1, Level-2 etc. have controlled access to data.
Users at a higher level can oversee data pertaining to the lower levels they administer. This implies that a user's access to audit log contents is controlled using role-based access control policies.

Type-I Data Security
The Cloud Network Administrator, being a super user, has complete control of the data. Log entries have unique attributes like TenantId, application-id or other public identifiers. The cloud provider repository contains data pertaining to all the tenants; users can access content based on the access restrictions implied by the associated access structure according to RBAC policies. The Type-I data protection scheme uses an IBE or ID-PKC algorithm such as the BB scheme. Phase-1 uses an ID-PKC scheme as it facilitates communication using public keys generated out of public identifiers like user-id, organization name or pin-code, which provides the advantage of less communication overhead. We explore the BB and BB-CKRS schemes for the Type-I data security implementation. For performance evaluation we evaluate combinations of these schemes and assess them using large datasets in our experiments.
Type-II Data Security
Data security implemented using the Type-II mechanism imposes access control restrictions on specific datasets or particular fields of a dataset tuple shared among the users. A user has access to data based on the access restrictions in the access structure embedded in the ciphertext. As an example, Level-1 users, who are at the bottom of the hierarchy, can see data pertaining to their own activity, while Level-2 users can oversee all Level-1 user data along with additional data pertaining to their own activity. We modify the existing CP-ABE so that the encrypted text embeds the role-based policy or access control data and user keys carry descriptive attributes such as organization, division and manager. The main reason for choosing CP-ABE for the Type-II security implementation is that it perfectly suits circumstances where user privileges (RBAC policies) determine access, and it ensures granular access to data. For Type-II data security we assess the BSW and Waters cryptographic schemes.
We use a Key Encapsulation Mechanism, or Symmetric Crypto Abstraction, for content encryption as used in practical web-based applications. Symmetric Crypto Abstraction is a combination of symmetric and asymmetric schemes for faster data encryption; this approach is used in HTTPS and other secure internet communication in real-world applications. The reason for choosing a symmetric key algorithm for large data encryption is that it exhibits higher performance, which is useful for real-time communications. Only the key parameters like the symmetric secret key or session-id are encrypted using the asymmetric key algorithm, in this case the Type-I or Type-II security scheme. So for practical applications requiring large data encryption, a symmetric key algorithm like the Advanced Encryption Standard (AES) in CBC mode with 128-bit security is used; a key size of 128 bits is sufficient, but 192 or 256 bits is desirable.

The following are two use cases we foresee for content sharing in a cloud scenario:
- The cloud service provider uses the Type-I scheme for content encryption, and tenants or individual users use a proxy service to interpret the data and re-encrypt it using the policy data or RBAC information of local users with the Type-II scheme.
- Alternatively, the cloud service provider uses the Type-II security mechanism with the RBAC policy tree (access structure) as input for content encryption and then re-encrypts using the Type-I scheme. The consumers or tenant users first decrypt the contents using the secret keys of the Type-I mechanism and apply one more round of decryption using the Type-II scheme.
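The layering order in the second use case can be illustrated structurally: the inner layer is applied first and must be removed last. Tuple wrapping stands in for the real IBE/CP-ABE operations; layer names and the log entry are made up:

```python
# Toy layering sketch: Type-2 (inner) then Type-1 (outer) encryption,
# undone by the tenant in reverse order.
def encrypt(layer: str, payload):
    return (layer, payload)

def decrypt(layer: str, ct):
    tag, payload = ct
    if tag != layer:
        raise ValueError(f"wrong layer: expected {layer}, got {tag}")
    return payload

log = "tenant-42 audit entry"
ct = encrypt("type1", encrypt("type2", log))    # provider side: inner, then outer
pt = decrypt("type2", decrypt("type1", ct))     # tenant side: outer, then inner
print(pt == log)  # -> True
```

Attempting to strip the inner layer first fails, which mirrors why the tenant needs the Type-I secret key before the Type-II policy keys become useful.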



