

Secure your Intelligent Applications with Microsoft Entra

Learn about adding user authentication and access control to AI and RAG applications using Azure and Microsoft Entra.

Presented at Microsoft Build 2024. Recording available from
https://build.microsoft.com/en-US/sessions/b5636ca7-64c2-493c-9b30-4a35852acfbe?source=sessions

Pamela Fox

May 23, 2024

Transcript

  1. Secure your Intelligent Applications with Microsoft Entra

    Den Delimarsky, Microsoft Entra
    Matthew Gotteiner, Azure AI Search
    Pamela Fox, Python Cloud Advocacy
  2. How do Authentication and Authorization Work?

    Authenticate users through the OpenID Connect (OIDC) protocol.
    Authorize users using the OAuth 2.0 protocol.

    Terminology:
    ▪ Auth Flow: authentication / authorization exchange
    ▪ Authorization Server: issues tokens for apps to access protected resources
    ▪ Client: app requesting access to a protected resource
    ▪ Resource Owner: owns the protected resource the client is trying to access
    ▪ Resource Server: provides access to protected data; relies on the authorization server for authentication and uses the token for authorization
  3. What kinds of clients are there?

    ▪ Single-page Application (SPA): web apps where tokens are acquired by a single-page application
    ▪ Public Client Application: covers other types of web apps, desktop apps, mobile apps, and even apps that run on devices without a browser that need to sign in users
    ▪ Confidential Client Application: apps that don't need to sign in a user
  4. The risks of using API keys to access Azure services

    • API keys can be easily leaked
    • API keys can be passed around a company (unintentionally)
    • API keys can be painful to rotate

    app.py

        client = openai.AzureOpenAI(
            api_version="2024-02-15-preview",
            azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            api_key=os.getenv("AZURE_OPENAI_KEY")
        )
  5. Accessing Azure services with managed identity

    Option 1: Azure App Service → System identity → Azure OpenAI
    Option 2: Azure App Service → User-assigned identity → Azure OpenAI
  6. Using Microsoft Entra ID for API authentication: Step 1

    • Give role-based access control to users or applications
    • Use managed identities for deployed applications (system or user-assigned)
    • Declare the roles in Bicep, a declarative language for infrastructure-as-code

    main.bicep

        // Role definition ID of the built-in "Cognitive Services OpenAI User" role
        var roleDefinitionId = '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd'

        resource role 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
          name: 'YOUR-UNIQUE-NAME'
          properties: {
            principalId: appIdentityId // Principal (object) ID of the identity
            principalType: 'ServicePrincipal'
            roleDefinitionId: resourceId('Microsoft.Authorization/roleDefinitions', roleDefinitionId)
          }
        }
  7. Using Microsoft Entra ID for API authentication: Step 2

    • Use the azure-identity SDK to get a token provider for your identity
    • Pass the token provider to the OpenAI SDK
    • Token providers take care of token refresh for you!

    app.py

        import os

        from azure.identity import DefaultAzureCredential, get_bearer_token_provider
        from openai import AzureOpenAI

        # Credential for the user-assigned managed identity of the app
        azure_credential = DefaultAzureCredential(
            managed_identity_client_id=os.getenv("APP_IDENTITY_ID"))
        # Callable that returns a fresh bearer token for the given scope
        token_provider = get_bearer_token_provider(
            azure_credential, "https://cognitiveservices.azure.com/.default")
        client = AzureOpenAI(
            api_version="2024-02-15-preview",
            azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            azure_ad_token_provider=token_provider
        )
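To make "token providers take care of token refresh" concrete, here is a minimal standard-library sketch of the general idea: a callable that caches a token and fetches a new one shortly before expiry. It is not the azure-identity implementation. The names `make_caching_token_provider`, `fetch_token`, and `refresh_margin_seconds` are hypothetical and exist only for this illustration.

```python
import time
from typing import Callable, Optional, Tuple


def make_caching_token_provider(
    fetch_token: Callable[[], Tuple[str, float]],
    refresh_margin_seconds: float = 300.0,
    clock: Callable[[], float] = time.time,
) -> Callable[[], str]:
    """Return a zero-argument callable that yields a currently valid token.

    fetch_token is a hypothetical function returning (token, expires_at_epoch).
    """
    cached_token: Optional[str] = None
    expires_at = 0.0

    def provider() -> str:
        nonlocal cached_token, expires_at
        # Refresh when there is no token yet or the cached one is close to expiry.
        if cached_token is None or clock() >= expires_at - refresh_margin_seconds:
            cached_token, expires_at = fetch_token()
        return cached_token

    return provider


# Usage with a fake token source that issues one-hour tokens.
issued = []
fake_now = [1_000.0]


def fake_fetch() -> Tuple[str, float]:
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1], fake_now[0] + 3600


provider = make_caching_token_provider(fake_fetch, clock=lambda: fake_now[0])
first = provider()    # no cached token yet, so one is fetched
second = provider()   # served from the cache
fake_now[0] += 3400   # now inside the 5-minute refresh margin
third = provider()    # triggers a refresh
```

The caller only ever invokes `provider()`, which is why an SDK can accept such a callable in place of a static key.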
  8. Deploying with the Azure Developer CLI

    The azd up command provisions Azure resources and roles based off infrastructure-as-code (Bicep) files, then deploys application code to hosts.

        >>> azd up

        Packaging services (azd package)
          (✓) Done: Packaging service web

        Provisioning Azure resources (azd provision)
        Subscription: ca-pamelafox-demo-test (32ea8a26-5b40-4838-b6cb-be5c89a57c16)
        Location: East US 2
          (✓) Done: Resource group: ragpostgres-rg
          (✓) Done: Azure OpenAI: ragpostgres-ndj764e3jrpxi-openai
          (✓) Done: Container App: ragpostgres-ndj764e-ca

        Deploying services (azd deploy)
          (✓) Done: Deploying service web
          - Endpoint: https://ragpostgres-ndj764e-ca.whiteglacier-476a7757.eastus2.azurecontainerapps.io/
  9. Registering with the Microsoft identity platform

    To request tokens from the Microsoft identity platform, you need to register a Microsoft Entra application and create a service principal for it.

    [Diagram labels: Microsoft Entra, Application Object, Microsoft Graph, Service Principal, Microsoft identity platform]
  10. Registering Entra applications with Graph SDK

    auth_init.py (aka.ms/azai/auth-builtin)

        graph_client = GraphServiceClient(credentials=credential, scopes=scopes)
        graph_client.applications.post(Application(
            display_name=f"ChatGPT Sample Client App {identifier}",
            sign_in_audience="AzureADMyOrg",
            web=WebApplication(
                redirect_uris=["http://YOUR-APP-URL/.auth/login/aad/callback"],
                implicit_grant_settings=ImplicitGrantSettings(enable_id_token_issuance=True)),
            required_resource_access=[
                RequiredResourceAccess(
                    resource_app_id="00000003-0000-0000-c000-000000000000",
                    resource_access=[
                        ResourceAccess(id="e1fe6dd8-ba31-4d61-89e7-88639da4683d", type="Scope"),  # Graph User.Read
                        ResourceAccess(id="7427e0e9-2fba-42fe-b0c0-848c9e6a8182", type="Scope"),  # offline_access
                        ResourceAccess(id="37f7f235-527c-4136-accd-4a02d197296e", type="Scope"),  # openid
                        ResourceAccess(id="14dad69e-099b-42c9-810b-d002981feec1", type="Scope"),  # profile
                    ])]))

    Graph SDKs available in C#, Go, Java, JavaScript, PHP, PowerShell, Python
  11. Setting Entra application credentials with Graph SDK

    auth_init.py (aka.ms/azai/auth-builtin)

        request_password = AddPasswordPostRequestBody(
            password_credential=PasswordCredential(display_name="WebAppSecret"),
        )
        graph_client.applications.by_application_id(app_id).add_password.post(request_password)

    Currently, app registrations can use either password or certificate credentials. (Stay tuned for a better way!)
  12. OAuth2 authentication flow with OIDC

    [Sequence diagram. Participants: Browser, App backend, Microsoft Entra servers]
    • Visits webapp
    • OAuth2 Leg 1: Initiate the authorization code flow (&scope=openid email name); Returns authorization URI; Returns redirect to URI; User signs in; Returns redirect to redirectURI
    • OAuth2 Leg 2: Exchange authorization code for token; Returns access token and ID token; Render webpage
  13. Implementing the authentication flow

    Option 1: For auth on Azure App Service or Container Apps
    Configure built-in authentication and authorization with Microsoft identity platform as the provider

    Option 2: For auth on any host (including local)
    Use MSAL packages to orchestrate OIDC flow using app registration
  14. Configuring built-in authentication for App Service

    • Set clientID to the app ID of the Entra app registration
    • Store app password in secrets and point clientSecretSettingName to that secret
    • Set openIdIssuer to the Microsoft IdP endpoint

    appauth.bicep (aka.ms/azai/auth-builtin)

        var loginEndpoint = environment().authentication.loginEndpoint
        var openIdIssuer = '${loginEndpoint}${tenant().tenantId}/v2.0'

        resource auth 'Microsoft.App/containerApps/authConfigs@2023-05-01' = {
          parent: app
          name: 'current'
          properties: {
            platform: {
              enabled: true
            }
            globalValidation: {
              redirectToProvider: 'azureactivedirectory'
              unauthenticatedClientAction: 'RedirectToLoginPage'
            }
            identityProviders: {
              azureActiveDirectory: {
                registration: {
                  clientId: clientId
                  clientSecretSettingName: clientSecretName
                  openIdIssuer: openIdIssuer
                }
              }
            }
          }
        }
  15. Extracting user details from headers

    The built-in authentication service injects headers into each request with details about the authenticated user, like their username.

    app.py (aka.ms/azai/auth-builtin)

        import base64
        import json

        def extract_username(headers, default_username="You"):
            # The header is absent when the request did not pass through built-in auth
            if "X-MS-CLIENT-PRINCIPAL" not in headers:
                return default_username
            # The header value is base64-encoded JSON containing a list of claims
            token = json.loads(base64.b64decode(headers.get("X-MS-CLIENT-PRINCIPAL")))
            claims = {claim["typ"]: claim["val"] for claim in token["claims"]}
            return claims.get("name", default_username)

        @app.get("/")
        async def index():
            username = extract_username(request.headers)
            return await render_template("index.html", username=username)

    https://learn.microsoft.com/azure/app-service/configure-authentication-user-identities
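The header parsing above can be tried without deploying anything. The sketch below repeats `extract_username` from the slide and feeds it a hand-built header value. `build_client_principal_header` is a hypothetical test helper, and the payload shape (a `claims` list of `typ`/`val` pairs) is assumed from the slide's code rather than from a captured request.

```python
import base64
import json


def build_client_principal_header(claims: dict) -> str:
    """Hypothetical test helper: encode claims in the shape the slide's code decodes."""
    payload = {"claims": [{"typ": key, "val": value} for key, value in claims.items()]}
    return base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")


def extract_username(headers, default_username="You"):
    if "X-MS-CLIENT-PRINCIPAL" not in headers:
        return default_username
    token = json.loads(base64.b64decode(headers["X-MS-CLIENT-PRINCIPAL"]))
    claims = {claim["typ"]: claim["val"] for claim in token["claims"]}
    return claims.get("name", default_username)


# A plain dict stands in for the framework's header object here.
headers = {"X-MS-CLIENT-PRINCIPAL": build_client_principal_header({"name": "Matt G"})}
signed_in = extract_username(headers)
anonymous = extract_username({})
```

A plain dict is case-sensitive, unlike the header objects most web frameworks provide, so the lookup key must match exactly in this sketch.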
  16. OAuth2 authentication flow

    [Sequence diagram. Participants: Browser, App backend, Microsoft Entra servers]
    • Visits webapp
    • OAuth2 Leg 1: Initiate the authorization code flow; Returns authorization URI; Returns redirect to URI; User signs in; Returns redirect to redirectURI
    • OAuth2 Leg 2: Exchange authorization code for token; Returns access token, ID token; Render user details
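As a rough illustration of the two legs, the sketch below builds an authorization URI for the Microsoft identity platform v2.0 authorize endpoint and then reads the authorization code back from the redirect. The function names, the tenant and client placeholders, and the localhost redirect URI are assumptions for this example. A real app would normally let MSAL do this, since MSAL also handles PKCE, nonces, and the token exchange itself.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_uri(tenant_id, client_id, redirect_uri, scopes, state):
    """Leg 1: URI the browser is redirected to so the user can sign in."""
    base = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/authorize"
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    })
    return f"{base}?{query}"


def read_authorization_code(callback_url, expected_state):
    """Start of leg 2: pull the code out of the redirect and check the state value."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; refusing to continue")
    return params["code"][0]


# Placeholder values; a real tenant ID, client ID, and redirect URI would differ.
state = secrets.token_urlsafe(16)
uri = build_authorization_uri(
    "TENANT_ID", "CLIENT_ID", "http://localhost:5000/redirect", ["openid", "profile"], state)
code = read_authorization_code(
    f"http://localhost:5000/redirect?code=abc123&state={state}", state)
```

The app backend would then send `code` to the token endpoint, together with its client credential, to receive the access token and ID token.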
  17. Using the Python MSAL SDK for authentication flows

    Configure the client ID and client credentials according to what you provisioned, then call the correct methods to generate the correct authentication URI and exchange the authorization code for a token.

        app = msal.ConfidentialClientApplication(
            os.getenv("CLIENT_ID"),
            client_credential=os.getenv("CLIENT_SECRET"),
            authority=f"https://login.microsoftonline.com/{os.getenv('TENANT_ID')}",
        )

        # LEG 1
        flow = app.initiate_auth_code_flow(scopes, redirect_uri=redirect_uri)
        # Redirect the user to the URI returned by that function ^
        ...
        # LEG 2
        # auth_flow is the flow dict from leg 1 (typically saved in the user's session)
        result = app.acquire_token_by_auth_code_flow(auth_flow, auth_response)
        access_token = result["access_token"]
  18. Using the identity package for easier auth flows

    Use the open-source identity package to use MSAL with popular Python frameworks. Just initialize the Auth object and add decorators to routes that require login.

    app.py (aka.ms/azai/auth-local)

        app = Quart("chatapp")
        auth = Auth(app,
            redirect_uri=os.getenv("REDIRECT_URI"),
            client_id=os.getenv("CLIENT_ID"),
            client_credential=os.getenv("CLIENT_SECRET"),
            authority=os.getenv("AUTHORITY")
        )

        @app.route("/")
        @auth.login_required
        async def index(*, context):
            # render_template is a coroutine in Quart, so it must be awaited
            return await render_template('index.html', user=context['user'])

    https://pypi.org/project/identity/
  19. Registering Entra applications with Graph Bicep template

    Create a Graph application and associated service principal in Bicep (vs. SDK). Public preview.

    appreg.bicep (https://aka.ms/graphbicep, aka.ms/graph-bicep-mi-fic)

        resource clientApp 'Microsoft.Graph/applications@beta' = {
          uniqueName: clientAppName
          displayName: clientAppDisplayName
          signInAudience: 'AzureADMyOrg'
          web: {
            redirectUris: ['${webAppEndpoint}/.auth/login/aad/callback']
            implicitGrantSettings: { enableIdTokenIssuance: true }
          }
          requiredResourceAccess: [
            {
              resourceAppId: '00000003-0000-0000-c000-000000000000'
              resourceAccess: [
                // User.Read
                { id: 'e1fe6dd8-ba31-4d61-89e7-88639da4683d', type: 'Scope' }
                // offline_access
                { id: '7427e0e9-2fba-42fe-b0c0-848c9e6a8182', type: 'Scope' }
                // openid
                { id: '37f7f235-527c-4136-accd-4a02d197296e', type: 'Scope' }
                // profile
                { id: '14dad69e-099b-42c9-810b-d002981feec1', type: 'Scope' }
              ]
            }
          ]
        }

        resource clientSp 'Microsoft.Graph/servicePrincipals@beta' = {
          appId: clientApp.appId
        }
  20. Using managed identity as federated identity credential

    App registrations can go password-less! More secure than secrets/certificates since no strings need to be stored securely or rotated. This will eventually be supported by Graph Bicep, MSAL SDKs, and Built-in Auth.

    appreg.bicep (aka.ms/graph-bicep-mi-fic)

        var openIdIssuer = '${loginEndpoint}${tenant().tenantId}/v2.0'

        resource webIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
          name: '${name}-id'
          location: location
        }

        resource clientAppFic 'federatedIdentityCredentials@beta' = {
          name: '${clientApp.uniqueName}/msiAsFic'
          audiences: ['api://AzureADTokenExchange']
          issuer: openIdIssuer
          subject: webIdentity.properties.principalId
        }
  21. Configuring built-in authentication for App Service

    Just change clientSecretSettingName to this exact value instead: 'OVERRIDE_USE_MI_FIC_ASSERTION_CLIENTID'

    appauth.bicep (aka.ms/graph-bicep-mi-fic)

        var loginEndpoint = environment().authentication.loginEndpoint
        var openIdIssuer = '${loginEndpoint}${tenant().tenantId}/v2.0'

        resource configAuth 'Microsoft.Web/sites/config@2022-03-01' = {
          parent: appService
          name: 'authsettingsV2'
          properties: {
            globalValidation: {
              requireAuthentication: true
              unauthenticatedClientAction: 'RedirectToLoginPage'
              redirectToProvider: 'azureactivedirectory'
            }
            identityProviders: {
              azureActiveDirectory: {
                enabled: true
                registration: {
                  clientId: clientId
                  clientSecretSettingName: 'OVERRIDE_USE_MI_FIC_ASSERTION_CLIENTID'
                  openIdIssuer: openIdIssuer
                }
              }
            }
            login: {
              tokenStore: {
                enabled: true
              }
            }
          }
        }
  22. Configuring MSAL constructors

    When using MSAL SDKs, you'll be able to use an identity as the credential. This will only work when a managed identity is available – not locally!

        import msal, requests

        mi = msal.ManagedIdentityClient(
            msal.UserAssignedManagedIdentity(client_id="guid"),
            http_client=requests.Session(),
            token_cache=msal.TokenCache(),
        )
        result = mi.acquire_token_for_client(resource="resource_abc")
  23. Understanding token claims

    Access tokens use the JSON Web Tokens (JWT) format. Claims, or key-value pairs, establish facts about the subject the token was issued for. Try decoding a token yourself at https://jwt.ms.

        {
          "aud": "https://management.core.windows.net/", // Token Audience (Resource Server)
          "iss": "https://sts.windows.net/f6a799a2-eb93-4e7f-9515-19e4a2e7af04/", // Token Issuer
          "iat": 1714775919, // Issued at time
          "nbf": 1714775919, // Do not process token before this time
          "exp": 1714780517, // Expiry time
          "name": "Matt G", // Display name of the user
          "oid": "8d5a813e-af85-47f1-b076-0b88e9cf8443", // Object identifier of the user
          "groups": ["b415f9c9-4f20-45b4-87a1-0ac9a142f0c5"], // Identifiers of user groups
          "scp": "user_impersonation" // OAuth 2.0 scopes that have been consented to
        }
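For a sense of what a decoder such as jwt.ms does with the middle segment of a token, here is a minimal standard-library sketch. It only base64url-decodes the payload and does not verify the signature, so it is suitable for inspection and never for making authorization decisions. The sample token is built locally from the claims shown on the slide, and `decode_jwt_claims` and `_b64url` are names made up for this example.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the claims (payload) segment of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature = token.split(".")
    # JWT segments drop base64 padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Hypothetical helper that encodes a dict the way a JWT segment is encoded."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Build an unsigned sample token locally; real tokens are issued and signed by Entra.
sample_claims = {
    "name": "Matt G",
    "oid": "8d5a813e-af85-47f1-b076-0b88e9cf8443",
    "groups": ["b415f9c9-4f20-45b4-87a1-0ac9a142f0c5"],
    "scp": "user_impersonation",
}
sample_token = ".".join(
    [_b64url({"alg": "none", "typ": "JWT"}), _b64url(sample_claims), "signature-placeholder"])

claims = decode_jwt_claims(sample_token)
user_oid = claims["oid"]
user_groups = claims.get("groups", [])
```

The `oid` and `groups` values extracted this way are the identifiers that the later slides store in the search index and use in filters.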
  24. Representing access control in AI Search indexes

    Search supports string collection fields. Directly map object and group identifiers from token claims to documents in the index.

        {
          "name": "index-with-access-control",
          "fields": [
            { "name": "key", "type": "Edm.String", "key": true },
            { "name": "oids", "type": "Collection(Edm.String)", "filterable": true },
            { "name": "groups", "type": "Collection(Edm.String)", "filterable": true }
          ]
        }
  25. Access control using filtering

    AI Search supports filtering documents in addition to normal searches. Efficiently search thousands of unique identifiers using "search.in".

        {
          "search": "document text",
          "filter": "oids/any(oid: search.in(oid, '3fd9a875-2e3d-4b97-8301-eb7b7e6a109e, a11be098-87b6-4c68-af19-79e44d927c4d, ...')) or groups/any(group: search.in(group, 'e432e4cd-8e1c-4a5e-9c0a-6e1fa3a6bb8d, 6f091fd9-5871-4d1b-8fd5-3dbef48b52a9, ...'))"
        }
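One way to assemble such a filter from a user's token claims is sketched below. The field names `oids` and `groups` follow the index definition in this deck, while `build_security_filter` and the GUID validation step are additions for this illustration: validating each identifier as a GUID keeps arbitrary text out of the filter expression.

```python
import uuid


def _as_guid(value: str) -> str:
    """Normalize a GUID string; raises ValueError for anything that is not a GUID."""
    return str(uuid.UUID(value))


def build_security_filter(oid: str, group_ids: list) -> str:
    """Build an OData filter matching documents shared with the user or their groups."""
    oid_clause = f"oids/any(oid: search.in(oid, '{_as_guid(oid)}'))"
    if not group_ids:
        return oid_clause
    # search.in accepts a comma-delimited list of values in one string.
    groups = ", ".join(_as_guid(group_id) for group_id in group_ids)
    return f"{oid_clause} or groups/any(group: search.in(group, '{groups}'))"


# Identifiers taken from the sample token claims earlier in the deck.
security_filter = build_security_filter(
    "8d5a813e-af85-47f1-b076-0b88e9cf8443",
    ["b415f9c9-4f20-45b4-87a1-0ac9a142f0c5"],
)
request_body = {"search": "document text", "filter": security_filter}
```

The resulting `request_body` has the same shape as the search request on the slide and would be sent by the app backend, never built in the browser, so that users cannot edit their own filter.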
  26. Updating access control associated with a document

    AI Search supports incremental updates to individual records. Include document key, "merge" action, and fields to update.

        {
          "value": [
            {
              "@search.action": "merge",
              "key": "my-document-key",
              "oids": [
                "c0f84485-7814-49b2-9128-9b3a5369c423",
                "7dc3d6e8-8d6b-4ae4-b288-8d50d605df55"
              ],
              "groups": [
                "f2b17199-8ec8-41b0-b0d7-1a6ad597f96e",
                "e5e0b705-993b-4880-81c8-3b0a3f7345f7"
              ]
            }
          ]
        }
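A merge replaces a collection field as a whole rather than appending to it, so granting access to one more user means sending the complete updated list. The sketch below builds such a batch. `build_grant_access_batch` is a hypothetical helper, and the key field name `key` matches the sample index in this deck.

```python
import json


def build_grant_access_batch(document_key: str, current_oids: list, new_oid: str) -> dict:
    """Build a merge batch whose oids list is the current list plus one more user."""
    updated_oids = list(current_oids)
    if new_oid not in updated_oids:
        updated_oids.append(new_oid)
    return {
        "value": [
            {
                "@search.action": "merge",
                "key": document_key,
                # Fields omitted from a merge (such as groups) are left unchanged.
                "oids": updated_oids,
            }
        ]
    }


batch = build_grant_access_batch(
    "my-document-key",
    ["c0f84485-7814-49b2-9128-9b3a5369c423"],
    "7dc3d6e8-8d6b-4ae4-b288-8d50d605df55",
)
payload = json.dumps(batch, indent=2)
```

The JSON in `payload` is what would be posted to the index's document indexing endpoint.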
  27. Combining AI Search and Data Lake Gen2 Storage

    Data Lake Gen2 Storage allows associating access control information with files and folders
  28. Fetching access control information

    • Fetch access control list from files or directories
    • Need to parse string to find exact group and object ids

        from azure.storage.filedatalake import DataLakeServiceClient
        from azure.identity import DefaultAzureCredential

        service_client = DataLakeServiceClient(
            account_url="https://account.dfs.core.windows.net",
            credential=DefaultAzureCredential())
        file_system_client = service_client.get_file_system_client("container")
        file_client = file_system_client.get_file_client("My Documents/notes.txt")

        # Request ACLs as GUIDs by setting user principal name to false
        access_control = file_client.get_access_control(upn=False)
        acl = access_control["acl"]
        # ACL Format: user::rwx,group::r-x,other::r--,user:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:r--
        acl_list = acl.split(",")
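The parsing step mentioned above could look roughly like the following sketch, which takes an ACL string in the format shown in the slide's comment and collects the named users and groups that have read permission. The policy here (read permission only, unnamed owner/group/other entries and `default:` entries skipped) is an assumption for this illustration and may not match what the sample repositories do.

```python
def parse_acl(acl: str):
    """Return (user_object_ids, group_ids) for named ACL entries with read permission."""
    oids, groups = [], []
    for entry in acl.split(","):
        parts = entry.split(":")
        # Skip default ACL entries ("default:user:...") and anything malformed.
        if len(parts) != 3:
            continue
        entry_type, identifier, permissions = parts
        # Entries with an empty identifier refer to the owning user, owning group, or others.
        if not identifier or not permissions.startswith("r"):
            continue
        if entry_type == "user":
            oids.append(identifier)
        elif entry_type == "group":
            groups.append(identifier)
    return oids, groups


# Sample ACL string with made-up assignments, in the format from the slide's comment.
sample_acl = (
    "user::rwx,group::r-x,other::r--,"
    "user:8d5a813e-af85-47f1-b076-0b88e9cf8443:r--,"
    "group:b415f9c9-4f20-45b4-87a1-0ac9a142f0c5:r-x,"
    "user:11111111-2222-3333-4444-555555555555:-w-"
)
oids, groups = parse_acl(sample_acl)
```

The two lists map directly onto the `oids` and `groups` fields of the search index, so they can be stored on each document at ingestion time.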
  29. Lifecycle of data with access control information

    Option 1: Ingest the documents from a data source with access control information
  30. Lifecycle of data with access control information

    Option 2: Ingest the documents from a data source without access control and join them with access control information
  31. Why not just use an index per user?

    • AI Search limitations mean you have a finite number of indexes per search service
    • S3 HD index max size is 100GB

    Tier      Maximum indexes
    Free      3
    Basic     15
    S1        50
    S2        200
    S3        200
    S3 HD     1000 per partition, max 3000 per service
    L1        10
    L2        10
  32. Additional security considerations

    • Use managed identity to access all Azure services
      https://aka.ms/oai/keyless

        azure_credential = DefaultAzureCredential()
        token_provider = get_bearer_token_provider(azure_credential,
            "https://cognitiveservices.azure.com/.default")
        client = AzureOpenAI(
            api_version="2024-02-15-preview",
            azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
            azure_ad_token_provider=token_provider
        )

    • Use a virtual network and private endpoint to isolate your AI apps
      https://aka.ms/azai/pvt
  33. Get started with our samples

    • Azure OpenAI + AI Search + Entra + MSAL + App Service Built-in Auth: aka.ms/ragchat
    • Azure OpenAI + Entra + Container Apps Built-in Auth: aka.ms/azai/auth-builtin
    • Azure OpenAI + Entra + MSAL + Identity package: aka.ms/azai/auth-local

    Find more samples at: aka.ms/azai
    Java, JavaScript, Python, .NET, OpenAI Assistants, Fine-tuning ...and more!
  34. How are you using Generative AI today and what are your challenges around securing it?

    Continue the conversation at the Microsoft Security Community Expert Meetup in the Ballroom on level 5.
    Join the Tech Community & Security Hub by following this QR code
  35. Session Resources

    Next steps, Resources, Related sessions
    Find all of this and more on the session details page: aka.ms/build/seattle/sessions